METHODS FOR DIAGNOSING, PREVENTING, AND TREATING
DEVELOPMENTAL DISORDERS DUE TO A COMBINATION OF 
GENETIC AND ENVIRONMENTAL FACTORS 

CROSS-REFERENCE TO RELATED APPLICATIONS 

The present application is a non-provisional application claiming the priority of
copending provisional U.S. Serial No. 60/136,198 filed May 25, 1999, the disclosure
of which is hereby incorporated by reference in its entirety. Applicants claim the
benefits of this application under 35 U.S.C. §119(e).

FIELD OF THE INVENTION 

The invention relates generally to novel methods of diagnosing, preventing, and
treating specific diseases which are caused by a combination of genetic and
environmental factors. One such disease exemplified is schizophrenia.

BACKGROUND OF THE INVENTION 

The term "schizophrenia" was introduced by Bleuler in the beginning of this century 
to encompass a dissociation or disruption of thought processes, along with a
dichotomy among thought, emotion, and behavior [Bleuler, Translation J. Zinkin,
New York: International University Press (1950)]. The current definition of
schizophrenia includes a break with reality that is usually manifested as
hallucinations, delusions, or disruption in thought processes [Carpenter et al., Medical
Progress, 330:681-690 (1994)]. At present the nationally accepted definition for the
diagnosis of schizophrenia is contained in Diagnostic and Statistical Manual for
Mental Disorders, Fourth Edition, Washington, D.C. (1994): American Psychiatric
Association, hereby incorporated by reference in its entirety. 

Schizophrenia is a clinical syndrome that has a profound influence on public health.
The symptoms of schizophrenia begin early in life and continue for most patients
throughout their lives. An estimate of the direct and indirect costs of schizophrenia
was thirty-three billion dollars for 1990 in the United States alone [Carpenter et al.,
1994, supra]. Indeed, one of every forty dollars spent for total health care
expenditures in the United States is spent on treating schizophrenia [Rupp et al.,
Psychiatric Clin. North Am., 16:413-423 (1993)]. Furthermore, estimates have been
made suggesting that up to 50% of the homeless American population is
schizophrenic [Bachrach, In: Treating the Homeless Mentally Ill, Washington, D.C.,
American Psychiatric Press, 13-40, Lamb et al. ed. (1992)].

The genetic factors in schizophrenia, though clearly documented to be present, are not
simple [Carpenter and Buchanan, N. Engl. J. Med., 330:681-689 (1994); Gottesman,
Clin. Genet., 46:116-123 (1994)]. Schizophrenia is, at least in part, a
neurodevelopmental disorder, a birth defect in which the brain has been subtly
damaged during development [Carpenter and Buchanan, N. Engl. J. Med., 330:681-689
(1994); Weinberger, Arch. Gen. Psychiatry, 44:660-669 (1987); Brixey et al., J.
Clin. Psychol., 49:447-456 (1993)]. Evidence of this damage is seen both at autopsy
[Kovelman and Scheibel, Biol. Psychiatry, 19:1601-1621 (1984); Bogerts et al., Arch.
Gen. Psychiatry, 42:784-791 (1985); Jakob and Beckman, J. Neural Transm.,
65:303-326 (1986); Brown et al., Arch. Gen. Psychiatry, 43:36-42 (1986); Benes and
Bird, Arch. Gen. Psychiatry, 44:608-616 (1987); Colter et al., Arch. Gen. Psychiatry,
44:1023 (1987); Altshuler et al., Arch. Gen. Psychiatry, 47:1029-1034 (1990);
Pakkenberg, Schizophr. Res., 7:95-100 (1992); Bogerts, Schizophr. Bull., 19:431-445
(1993); Shapiro, Schizophr. Res., 10:187-239 (1993)] and by neuroimaging [Jeste et
al., Br. J. Psychiatry, 153:444-459 (1988); Suddath et al., Am. J. Psychiatry,
146:464-472 (1989); Suddath et al., N. Engl. J. Med., 322:789-794 (1990); DeLisi et
al., Biol. Psychiatry, 29:159-175 (1991); Breier et al., Arch. Gen. Psychiatry,
49:921-926 (1992); O'Callaghan et al., J. R. Soc. Med., 85:227-231 (1992); Bogerts et
al., Biol. Psychiatry, 33:236-246 (1993); Andreasen et al., Science, 266:294-298 (1994)].
The pattern of this brain damage and the presence of minor congenital abnormalities
point to an insult occurring during the second trimester of fetal development [Bracha
et al., Biol. Psychiatry, 30:719-725 (1991); Bracha et al., Am. J. Psychiatry,
149:1355-1361 (1992); Green et al., Psychiatry Res., 53:119-127 (1994)].
Epidemiological studies have documented a season-of-birth effect by which
schizophrenics are more frequently born during winter and early spring than during
other seasons [Boyd et al., Schizophr. Bull., 12:173-186 (1986); Kendell and Adams,
Br. J. Psychiatry, 158:758-763 (1991); O'Callaghan et al., Br. J. Psychiatry,
158:764-769 (1991)]. Also, individuals exposed to an influenza epidemic [Mednick
et al., Arch. Gen. Psychiatry, 45:189-192 (1988); Barr et al., Arch. Gen. Psychiatry,
47:869-874 (1990); O'Callaghan et al., Lancet, 337:1248-1250 (1991); Murray et
al., J. Psychiatr. Res., 26:225-235 (1992); Adams et al., Br. J. Psychiatry,
163:522-534 (1993)] or famine [Susser and Lin, Arch. Gen. Psychiatry, 49:983-988 (1992)]
during their second trimester of fetal development have increased risk of later
developing schizophrenia, according to some studies but not others [Kendell, Arch.
Gen. Psychiatry, 46:878-882 (1989); Crow and Done, Br. J. Psychiatry, 161:390-393
(1992)]. This has suggested that an environmental effect such as dietary deficiency,
virus infection [Kirch, Schizophr. Bull., 19:355-370 (1993)], vitamin deficiency, or
effect of cold weather may be acting during fetal development.

Linkage mapping studies in schizophrenia have been difficult. Recently, some
studies [Straub et al., Nature Genet., 11:287-293 (1995); Schwab et al., Nature
Genet., 11:325-327 (1995); Moises et al., Nature Genet., 11:321-324 (1995)] have
supported a gene locus on chromosome 6 (6p24-22, near the HLA region) as having
an effect in schizophrenia; other studies gave little or no support to a marker in this
region [Wang et al., Nature Genet., 10:41-46 (1995); Mowry et al., Nature Genet.,
11:233-234 (1995); Gurling et al., Nature Genet., 11:234-235 (1995); Antonarakis et
al., Nature Genet., 11:235-236 (1995)]. At best this locus appeared to be involved in
only about 15-30% of families [Straub et al., 1995, supra]. Also, some evidence for
loci on chromosomes 3 [Pulver et al., Am. J. Med. Genet., 60:252-260 (1995)], 8
[Pulver et al., Am. J. Med. Genet., 60:252-260 (1995); Kendler et al., Am. J. Psych.,
153:1534-1540 (1996)], 9 [Coon et al., Biol. Psychiatry, 34:277-289 (1993); Moises et
al., Nature Genet., 11:321-324 (1995)] and 22 [Coon et al., Am. J. Med. Genet.,
54:72-79 (1994); Pulver et al., Am. J. Med. Genet., 54:3-43 (1994)] have been
reported. In addition, two polymorphic markers very close to the gene encoding
dihydrofolate reductase (DHFR) on chromosome 5q, D5S76 and D5S39, gave very
high lod scores (as high as 6.49, i.e. odds of about 3 million to one in favor of genetic
linkage versus chance occurrence) in 7 British and Icelandic schizophrenia families
studied [Schwab et al., Nat. Genet., 11:325-327 (1997); Straub et al., Molec.
Psychiatr., 2:148-155 (1997)]. However, this result could not be confirmed in studies
of numerous other families.
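
As a worked check of the arithmetic quoted above: a lod score is the base-10 logarithm of the odds in favor of linkage, so a lod score of 6.49 corresponds to odds of 10^6.49 to one. The minimal Python sketch below performs that conversion; the function name is illustrative only and is not part of the disclosure.

```python
def lod_to_odds(lod_score):
    """Convert a lod score (log10 of the odds for linkage) into odds-to-one."""
    return 10.0 ** lod_score


# The lod score reported for the markers near DHFR on chromosome 5q.
odds = lod_to_odds(6.49)
print(f"A lod score of 6.49 corresponds to odds of about {odds:,.0f} to one")
```

The result is slightly above three million, consistent with the "about 3 million to one" stated in the text.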



There could be several reasons for this difficulty. First, there may be more than one
gene involved (locus heterogeneity). Second, the genetic factor(s) may be common
in the population (high disease allele frequency), thus diminishing the power of
linkage studies [Terwilliger and Ott, Handbook of Human Genetic Linkage,
Baltimore: Johns Hopkins Univ. Pr., 181 (1994)]. Third, the correct genetic model
may be unknown [Owen, Psychol. Med., 22:289-293 (1992)]. Any or all of these
factors could diminish the power of a linkage study sufficiently to make success very
difficult [Terwilliger and Ott, 1994, supra].

Thus the current (developmental) model for schizophrenia is that genetic and 
environmental factors cause brain damage in a fetus that later develops schizophrenia. 
However, the genetic and environmental factors have not been identified. Also, 
extensive linkage and association studies have failed to identify genes determining 
schizophrenia.

Indeed, schizophrenia appears to be just one of a family of developmental disorders
whose cause has not been identified. Other such developmental disorders are defined
by the Diagnostic and Statistical Manual for Mental Disorders, Fourth Edition,
Washington, D.C. (1994) and include: Tourette Syndrome which is identical to
Tourette's Disorder and is a subcategory of Tic Disorders; Bipolar Disorder which is
identical with Bipolar I Disorder or Bipolar II Disorder; Autism which is identical
with Autistic Disorder which is a subcategory of Pervasive Developmental Disorders;
Conduct disorder which is a subcategory of Attention-Deficit and Disruptive
Behavioral Disorders; Attention-Deficit Hyperactivity Disorder which is identical to
Attention-Deficit/Hyperactivity Disorder and to Attention-Deficit/Hyperactivity
Disorder NOS (not otherwise specified) which is also a subcategory of Attention-Deficit
and Disruptive Behavioral Disorders; Obsessive-Compulsive Disorder which
is a subtype of Anxiety Disorders; Chronic Multiple Tics Syndrome which is identical
to Chronic Motor or Vocal Tic Disorder which is a subtype of Tic Disorders; and
Learning Disorders.

In addition, spina bifida is a developmental disorder. Spina bifida is a form of neural
tube defect in which neural elements (spinal nerves or spinal cord) or coverings of
the brain and spinal cord (dura mater, arachnoid mater) herniate through a midline
defect into a cystic cavity covered completely or partially by skin.

Therefore, there is a need for new methods of diagnosing individuals susceptible to
developing a developmental disorder. In addition, there is a need for methods of
identifying individuals susceptible to having offspring that develop a developmental
disorder. Finally, there is a need for a method of treating such susceptible individuals
in order to prevent and/or ameliorate the symptoms due to and/or associated with the
developmental disorder.

The citations of any reference herein should not be construed as an admission that 
such reference is available as "Prior Art" to the instant application. 

SUMMARY OF THE INVENTION 

The present invention provides methods of diagnosing, preventing and/or treating
specific developmental disorders. Towards this end the present invention provides
methods of identifying an individual as being genetically or environmentally
susceptible for developing or having a developmental disorder or for having offspring
that develop the developmental disorder. Such a developmental disorder can be
schizophrenia, spina bifida cystica, Tourette's syndrome, bipolar illness, autism,
conduct disorders, attention deficit disorder, obsessive compulsive disorder, chronic
multiple tic syndrome and learning disorders such as dyslexia. In addition, any of the
methods provided herein for identifying an individual as being genetically and/or
environmentally susceptible for having or developing a developmental disorder or for
having offspring that develop the developmental disorder can also be used in
diagnosing the individual, preferably in conjunction with a clinical diagnosis.

Therefore, the present invention provides methods of identifying an individual as
being genetically susceptible for having or developing a developmental disorder.
The present invention further provides methods of identifying an individual as being
genetically susceptible for having offspring that are susceptible for developing a
developmental disorder. Methods of identifying an individual as being susceptible
due to environmental factors for having or developing a developmental disorder are
also provided. In addition, the present invention provides methods of identifying an
individual as being susceptible of having offspring that are susceptible for developing
a developmental disorder. The present invention also provides methods of identifying
an individual as being susceptible for having or developing a developmental disorder
due to both environmental and genetic factors. The present invention further provides
methods of identifying an individual as being susceptible for having offspring that are
susceptible for developing a developmental disorder.

The present invention therefore provides methods for compiling genetic reference 
datasets, environmental reference datasets and/or genetic and environmental reference 
datasets for use in determining a predicted probability for an individual of having a 
susceptibility for having or developing a developmental disorder, or for having
offspring that develop a developmental disorder. 

In one aspect of the invention, the present invention provides methods that comprise 
generating a genetic reference dataset for use in determining the predicted probability
of an individual for having a susceptibility for having or developing a developmental
disorder due to genetic factors, or for having offspring that develop a developmental
disorder due to genetic factors. 

One such embodiment comprises collecting a biological sample from a human
subject. The human subject can be a diagnostic proband, a blood relative of the
diagnostic proband, an affected proband, a blood relative of the affected proband, a
control proband, and/or a blood relative of the control proband. The biological
sample contains nucleic acids and/or proteins from the human subject. The nucleic
acids and/or proteins from the biological sample are then analyzed resulting in a
partial or full genotype for the alleles of the genes involved in folate, pyridoxine,
and/or cobalamin metabolism. The partial or full genotype then forms a dataset of
genetic explanatory variables for the human subject. The dataset of genetic
explanatory variables is then compiled from multiple human subjects into a genetic
reference dataset. Such compilations are exemplified in the Detailed Description and
Examples below.
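
The compilation step described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the record layout, the function name, the locus labels "GENE_A" and "GENE_B", and the coding of a genotype as a count of variant alleles (0, 1 or 2) are all hypothetical and are not taken from the disclosure. A locus that was not typed for a subject is stored as None, which is one way to represent a partial genotype.

```python
from dataclasses import dataclass, field


@dataclass
class SubjectRecord:
    """One human subject and the genetic explanatory variables obtained for them."""
    subject_id: str
    role: str  # e.g. "affected proband" or "blood relative of the control proband"
    genotype: dict = field(default_factory=dict)  # locus label -> variant allele count


def compile_reference_dataset(records):
    """Compile per-subject genotypes into one table with a column per locus."""
    loci = sorted({locus for record in records for locus in record.genotype})
    rows = []
    for record in records:
        row = {"subject_id": record.subject_id, "role": record.role}
        for locus in loci:
            # None marks a locus that was not typed (a partial genotype).
            row[locus] = record.genotype.get(locus)
        rows.append(row)
    return loci, rows


# Hypothetical subjects and locus labels, for illustration only.
records = [
    SubjectRecord("S1", "affected proband", {"GENE_A": 2, "GENE_B": 1}),
    SubjectRecord("S2", "control proband", {"GENE_A": 0}),
]
loci, rows = compile_reference_dataset(records)
```

Each row of the resulting table holds the explanatory variables for one subject, so that rows from many subjects together form the reference dataset.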

In another aspect, the present invention provides a method that comprises generating
a genetic and environmental reference dataset for use in determining the predicted
probability of an individual for having a susceptibility for having or developing a
developmental disorder due to genetic factors and environmental factors, or for
having offspring that develop a developmental disorder due to genetic factors and
environmental factors. One such embodiment comprises obtaining dietary and
epidemiological information for environmental explanatory variables for the human
subjects and combining the environmental explanatory variables with a genetic
reference dataset for the human subjects as described above.

In another aspect, the present invention provides an environmental reference dataset
for use in the determination of the predicted probability for an individual for having a
susceptibility for having or developing a developmental disorder due to
environmental factors, or for having offspring that develop a developmental disorder
due to environmental factors. One such embodiment comprises obtaining dietary and
epidemiological information for environmental explanatory variables for a human
subject. The human subject can be a diagnostic proband, a blood relative of the
diagnostic proband, an affected proband, a blood relative of the affected proband, a
control proband, or a blood relative of the control proband. The dataset of
environmental explanatory variables is then compiled from multiple human subjects
into an environmental reference dataset for the human subjects.

The developmental disorder forming the basis of the reference datasets of the present
invention can be schizophrenia, or spina bifida cystica, or Tourette's syndrome, or
dyslexia, or conduct disorder, or attention-deficit hyperactivity disorder, or bipolar
illness, or autism, or chronic multiple tic syndrome or obsessive-compulsive disorder,
or like disorders. A blood relative is preferably the mother of the individual, a
sibling, the father or a grandparent of the individual. When the reference dataset is
for use in the determination of the predicted probability for an individual of having a
susceptibility for having offspring that develop a developmental disorder, the
individual is preferably a pregnant woman. The reference datasets of the present
invention are themselves part of the present invention.



The present invention further provides methods of estimating the genetic
susceptibility of an individual to have or to develop a developmental disorder, or to
have offspring that develop a developmental disorder. In one such embodiment the
method comprises collecting a biological sample from a participant (or participants)
who is either the individual or a blood relative of the individual. The biological
sample contains nucleic acids and/or proteins of the participant. The analysis of the
nucleic acids and/or proteins from the biological sample yields a partial or full
genotype for the alleles of the genes involved in folate, pyridoxine, and/or cobalamin
metabolism. The partial or full genotype forms a dataset of genetic explanatory
variables for the participants. The dataset of genetic explanatory variables obtained
is added to a genetic reference dataset forming a combined genetic dataset. A model
is then formulated comprising the genetic explanatory variables obtained from the
participants and the combined genetic dataset is analyzed. A predicted probability for
the individual for having and/or developing a developmental disorder and/or having
offspring that develop a developmental disorder is then determined. The genetic
susceptibility of an individual to have or to develop a developmental disorder and/or
have offspring that develop a developmental disorder is estimated. In a preferred
embodiment, analyzing the combined genetic dataset is performed by binary linear
regression. In a more preferred embodiment, the binary linear regression is
performed with the SAS system. In another preferred embodiment, the model is
modified by adding or subtracting one or more genetic explanatory variables and the
combined genetic dataset is re-analyzed, preferably by binary logistic regression. In
this case a model is chosen that best fits the data. This can be accomplished by
testing the model for goodness of fit.
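
The analysis step can be illustrated with a small binary logistic regression, the technique named above for re-analysis. The sketch below is not the SAS procedure referred to in the text; it is a plain-Python fit by gradient ascent on the log-likelihood, chosen here only so that the example is self-contained. The data are invented for illustration (one hypothetical 0/1 genetic explanatory variable, ten carriers and ten non-carriers), and all function names are assumptions.

```python
import math


def predict(weights, x):
    """Predicted probability of being affected; weights[0] is the intercept."""
    z = weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, outcomes, learning_rate=0.5, iterations=5000):
    """Fit a binary logistic regression by gradient ascent on the log-likelihood."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    n = len(rows)
    for _ in range(iterations):
        gradient = [0.0] * (n_features + 1)
        for x, y in zip(rows, outcomes):
            error = y - predict(weights, x)
            gradient[0] += error
            for j, xj in enumerate(x):
                gradient[j + 1] += error * xj
        weights = [w + learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


def log_likelihood(weights, rows, outcomes):
    """Log-likelihood of the data under the fitted model (higher is a better fit)."""
    total = 0.0
    for x, y in zip(rows, outcomes):
        p = predict(weights, x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# Invented data: variable = 1 for carriers of a hypothetical variant, 0 otherwise.
rows = [[1]] * 10 + [[0]] * 10
outcomes = [1] * 6 + [0] * 4 + [1] * 2 + [0] * 8  # 1 = affected, 0 = unaffected

full_model = fit_logistic(rows, outcomes)
p_carrier = predict(full_model, [1])
p_noncarrier = predict(full_model, [0])

# Model with the genetic variable removed (intercept only), for comparison.
empty_rows = [[] for _ in rows]
null_model = fit_logistic(empty_rows, outcomes)
ll_full = log_likelihood(full_model, rows, outcomes)
ll_null = log_likelihood(null_model, empty_rows, outcomes)
```

With a single 0/1 variable the fitted probabilities reproduce the observed proportions in each group (6 of 10 carriers, 2 of 10 non-carriers). Comparing `ll_full` with `ll_null` is one simple ingredient of choosing between models with and without a given explanatory variable; a formal goodness-of-fit test would go further than this sketch.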

The present invention also provides methods of estimating the genetic and
environmental susceptibility of an individual to have or to develop a developmental
disorder and/or for having offspring that develop a developmental disorder. One such
embodiment comprises collecting a biological sample from one or more participants.
Again, the participant is either the individual or a blood relative of the individual.
The biological sample contains nucleic acids and/or proteins of the participant. The
nucleic acids and/or proteins from the biological sample are analyzed resulting in a
partial or full genotype for the alleles of the genes involved in folate, pyridoxine,
and/or cobalamin metabolism. The partial or full genotype forms a dataset of genetic
explanatory variables for the participant. Dietary and epidemiological information
for environmental explanatory variables for the participant(s) is also obtained and
is used to form a dataset of environmental explanatory variables for the
participant(s). The dataset of genetic explanatory variables and the dataset of
environmental explanatory variables are added to a genetic and environmental
reference dataset forming a combined genetic and environmental dataset. A model is
formulated comprising the genetic and environmental explanatory variables obtained
from the participant(s). The combined genetic and environmental dataset is then
analyzed and a predicted probability for the individual for having and/or developing a
developmental disorder and/or for having offspring that develop a developmental
disorder is determined. The genetic and environmental susceptibility of an individual
to have or to develop a developmental disorder and/or have offspring that develop a
developmental disorder is estimated. In a preferred embodiment, analyzing the
combined genetic and environmental dataset is performed by binary linear regression.
In a more preferred embodiment the binary linear regression is performed with the
SAS system. In another preferred embodiment the model is modified by adding or
subtracting one or more genetic and/or environmental explanatory variables and the
combined genetic and environmental dataset is re-analyzed, preferably by binary
logistic regression. In this case a model is chosen that best fits the data. This can be
accomplished by testing the model for goodness of fit.

For any of these methods, the developmental disorder can be schizophrenia, spina 
bifida cystica, Tourette's syndrome, bipolar illness, autism, conduct disorder, 
attention deficit hyperactivity disorder, obsessive compulsive disorder, chronic 
multiple tic syndrome and learning disorders such as dyslexia. 

In a particular embodiment, the individual is suspected of being genetically
susceptible of having or for developing the developmental disorder and/or of being
genetically susceptible of having offspring that develop the developmental disorder.
In a preferred embodiment of this type, the individual is suspected of being
genetically susceptible for having or for developing the developmental disorder
and/or of being genetically susceptible of having offspring that develop the
developmental disorder because a blood relative has the developmental disorder. In
one such embodiment the blood relative is a parent, a sibling, or a grandparent. In a
preferred embodiment the blood relative is the mother of the individual. In a
particular embodiment in which the individual is suspected of being genetically
susceptible of having offspring that develop the developmental disorder, the
individual is a pregnant woman. In another such embodiment the individual is the
mate of the pregnant woman. In a particular embodiment exemplified below, the
developmental disorder is schizophrenia.

Since the availability of the data regarding the genetic and environmental explanatory
factors can vary in separate determinations, variation in the explanatory factors used
is clearly envisioned by the present invention.

The present invention further provides methods of lowering the risk of a pregnant
woman to have a child that will develop a developmental disorder. One such 
embodiment comprises administering methylfolate, cobalamin or pyridoxine to the 
pregnant woman and/or fetus, which lowers the risk of the pregnant woman to give 
birth to a child with a developmental disorder. In a particular embodiment of this 
type, the pregnant woman had been previously determined to be susceptible of having 
offspring that develop a developmental disorder by a method disclosed herein. The 
present invention further provides a method of determining if any treatment is 
advisable for a pregnant woman that is genetically susceptible to having offspring that 
develop a developmental disorder which comprises determining the concentration of 
a risk factor from a tissue sample or body fluid from the pregnant woman. When the 
concentration of the risk factor is statistically above or below an accepted normal 
range, treatment is advisable. 
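
The decision rule in the preceding paragraph reduces to a range check, sketched below in Python. This is a simplification under stated assumptions: the text speaks of a concentration that is "statistically" above or below the normal range, whereas the sketch compares a single measured value against fixed limits and ignores measurement uncertainty. The numeric limits used in the example are hypothetical placeholders, not values from the disclosure; an actual normal range would come from the testing laboratory.

```python
def treatment_advisable(concentration, normal_low, normal_high):
    """Return True when the risk-factor concentration lies outside the normal range."""
    return concentration < normal_low or concentration > normal_high


# Hypothetical normal range for a risk factor such as homocysteine,
# in arbitrary units; placeholder values for illustration only.
NORMAL_LOW = 5.0
NORMAL_HIGH = 15.0

elevated = treatment_advisable(22.0, NORMAL_LOW, NORMAL_HIGH)    # above the range
within = treatment_advisable(9.0, NORMAL_LOW, NORMAL_HIGH)       # inside the range
depressed = treatment_advisable(2.5, NORMAL_LOW, NORMAL_HIGH)    # below the range
```

The same comparison, with the logic inverted, describes the monitoring case in which a value that falls within the normal range indicates that treatment is deemed effective.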

The present invention further provides methods of determining if any treatment is
advisable for a pregnant woman who has been determined to be susceptible to having
offspring that develop a developmental disorder. One such embodiment comprises
determining the concentration of a risk factor from a tissue sample or body fluid from
the pregnant woman. When the concentration of the risk factor is statistically above
or below an accepted normal range, treatment is advisable. In a particular
embodiment of this type, the pregnant woman had been previously determined to be
susceptible of having offspring that develop a developmental disorder by a method
disclosed herein.

Methods of monitoring the effect of the administration of methylfolate, cobalamin or
pyridoxine to the pregnant woman who has been determined to be susceptible to
having offspring that develop a developmental disorder are also included in the
present invention. One such embodiment comprises determining the concentration of
a risk factor from a tissue sample or body fluid from the pregnant woman. When the
concentration of the risk factor is statistically within an accepted normal range, the
treatment is deemed effective. In a particular embodiment of this type, the pregnant
woman had been previously determined to be susceptible of having offspring that
develop a developmental disorder by a method disclosed herein. The risk factor can
be any substance and/or metabolite linked to folate and/or cobalamin and/or
pyridoxine metabolism. In one embodiment, the risk factor is homocysteine. In yet
another embodiment, the risk factor is folate. In still another embodiment, the risk
factor is cobalamin.

The present invention also provides a method of treating an asymptomatic individual 
determined to be susceptible for developing a developmental disorder comprising 
administering methylfolate, cobalamin and/or pyridoxine. In a particular embodiment 
of this type, the asymptomatic individual had been previously determined to be 
susceptible of developing a developmental disorder by a method disclosed herein.



The DNA samples from the persons tested may be obtained from any source
including blood, a tissue sample, amniotic fluid, a chorionic villus sampling,
cerebrospinal fluid, and urine.

The present invention includes but is not limited to the examples of proteins encoded
by genes involved in folate, cobalamin and pyridoxine metabolism compiled in
Tables 2-7 in the Detailed Description of the Invention, below. For certain genes
nucleic acid and/or amino acid sequence data is also provided. These genes and
related sequence data are solely intended as examples of genes that are suitable to be
used in the methods described herein. Such sequence data can be used for carrying
out the genetic analysis of the present invention. However, the present invention is
not intended to be limited in any way to such lists of proteins or the related sequence
data.

It is further contemplated by the present invention to provide methods that include the
testing for genetic mutations in individual genes involved in folate and cobalamin
metabolism and/or in individual combinations of such genes (e.g.,
methylenetetrahydrofolate reductase gene and methionine synthase). In addition, all
possible combinations and permutations of such genes, including a constellation
comprising all of the genes involved in folate, pyridoxine, and cobalamin metabolism,
are envisioned by the present invention. Alternatively, a constellation of genes in
which any one or more genes can be excluded from those tested is also contemplated
by the present invention (for example, a given constellation of genes can include
genes encoding all of the proteins in Tables 2 and 4 except the folate receptor 2-like
protein). Thus all of such possible constellations are envisioned by, and are therefore
part of the present invention.
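
To make the notion of gene constellations concrete, the Python sketch below enumerates every non-empty subset of a short gene list. It is an illustration only: the three genes are ones named in this document, standing in for the full lists of Tables 2-7, and the sketch treats a constellation as an unordered set of genes, which is an assumption about how the text's "combinations and permutations" would be used in testing. For n genes there are 2^n - 1 such subsets.

```python
from itertools import combinations

# Three genes mentioned in the text, used here as a stand-in for the full tables.
genes = [
    "methylenetetrahydrofolate reductase",
    "methionine synthase",
    "dihydrofolate reductase",
]


def gene_constellations(gene_list):
    """Yield every non-empty, unordered subset of the given genes."""
    for size in range(1, len(gene_list) + 1):
        for subset in combinations(gene_list, size):
            yield subset


constellations = list(gene_constellations(genes))
print(len(constellations), "constellations from", len(genes), "genes")
```

The list includes single genes, each pair, and the full set, and a constellation that excludes one gene is simply one of the smaller subsets.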

The present invention also provides DNA polymorphisms that can be used as genetic 
explanatory factors in the present invention. One such embodiment is a nucleic acid 
encoding a genetic variant of human dihydrofolate reductase comprising a nucleotide 
sequence having a 19 base-pair deletion spanning nucleotides 540 to 558 of the 
nucleotide sequence of SEQ ID NO:41. In a preferred embodiment the nucleic acid
has the nucleotide sequence of SEQ ID NO:42. 

The present invention also includes primers. One such embodiment is a PCR primer
that can be used to distinguish SEQ ID NO:42 from SEQ ID NO:41. Another
embodiment is a PCR primer that can be used to distinguish SEQ ID NO:42 from
SEQ ID NO:45. These primers are useful for identifying the 19 base-pair deletion
spanning nucleotides 540 to 558 of the nucleotide sequence of SEQ ID NO:41 (see
Example 2). In a particular embodiment, the PCR primer comprises 8 to 100 and
preferably 10 to 50 consecutive nucleotides from the nucleotide sequence of SEQ ID
NO:41. In another embodiment the PCR primer comprises 8 to 100 and preferably 10
to 50 consecutive nucleotides from the nucleotide sequence of the complementary
strand of SEQ ID NO:41. In still another embodiment the PCR primer comprises 8 to
100 and preferably 10 to 50 consecutive nucleotides from the nucleotide sequence of
SEQ ID NO:42. In yet another embodiment the PCR primer comprises 8 to 100 and
preferably 10 to 50 consecutive nucleotides from the nucleotide sequence of the
complementary strand of SEQ ID NO:42. In still another embodiment the PCR
primer comprises 8 to 100 and preferably 10 to 50 consecutive nucleotides from the
nucleotide sequence of SEQ ID NO:45. In yet another embodiment the PCR primer
comprises 8 to 100 and preferably 10 to 50 consecutive nucleotides from the
nucleotide sequence of the complementary strand of SEQ ID NO:45.

In a particular embodiment the PCR primer comprises 8 to 100 and preferably 10 to
50 consecutive nucleotides from nucleotides 350 to 530 of SEQ ID NO:41. In a
preferred embodiment of this type, the PCR primer has the nucleotide sequence of
CTAAACTGCATCGTCGCTGTG (SEQ ID NO:38). In another particular
embodiment the PCR primer comprises 8 to 100 and preferably 10 to 50 consecutive
nucleotides from the complementary strand of nucleotides 550 to 850 of SEQ ID
NO:41. In a preferred embodiment of this type, the PCR primer comprises 8 to 100
and preferably 10 to 50 consecutive nucleotides from the complementary strand of
nucleotides 570 to 690 of SEQ ID NO:41. In a particular embodiment, the PCR
primer has the nucleotide sequence of AAAAGGGGAATCCAGTCGG (SEQ ID
NO:39).
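
The primer length ranges and the idea of a primer taken from the complementary strand can be sketched in Python as follows. The two primer sequences are those quoted above; the function names and the labels "preferred" and "allowed" are illustrative assumptions. The sketch also assumes conventional 5'-to-3' notation, under which a primer drawn from the complementary strand is the reverse complement of the corresponding stretch of the strand as written. SEQ ID NO:41 is not reproduced in this section, so the sketch does not attempt to locate the primers within it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def primer_length_class(primer):
    """Classify a primer by the length ranges given in the text."""
    length = len(primer)
    if 10 <= length <= 50:
        return "preferred"      # 10 to 50 consecutive nucleotides
    if 8 <= length <= 100:
        return "allowed"        # 8 to 100 consecutive nucleotides
    return "outside range"


SEQ_ID_NO_38 = "CTAAACTGCATCGTCGCTGTG"  # primer from the strand as written
SEQ_ID_NO_39 = "AAAAGGGGAATCCAGTCGG"    # primer from the complementary strand

for name, primer in (("SEQ ID NO:38", SEQ_ID_NO_38), ("SEQ ID NO:39", SEQ_ID_NO_39)):
    print(name, len(primer), "nt:", primer_length_class(primer))

# The stretch of the written strand that SEQ ID NO:39 would pair against.
target_of_39 = reverse_complement(SEQ_ID_NO_39)
```

Both quoted primers fall within the preferred 10 to 50 nucleotide range.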

The present invention also provides a nucleic acid that hybridizes under standard
hybridization conditions to the nucleotide sequence ACCTGGGCGGGACGCGCCA
(SEQ ID NO:40). In another embodiment the nucleic acid hybridizes under standard
hybridization conditions to the nucleotide sequence complementary to SEQ ID
NO:40. In yet another embodiment the nucleic acid hybridizes under standard
hybridization conditions to the nucleotide sequence ACCTGGGCGGGACGCGCC
(SEQ ID NO:46). In yet another embodiment the nucleic acid hybridizes under
standard hybridization conditions to the nucleotide sequence complementary to SEQ
ID NO:46. In a particular embodiment the nucleic acid consists of 9 to 96
nucleotides. In another embodiment the nucleic acid consists of 12 to 48 nucleotides.
In still another embodiment the nucleic acid consists of 15 to 36 nucleotides. In a
preferred embodiment the nucleic acid consists of 17 to 20 nucleotides.

The present invention also provides a nucleic acid that hybridizes to the nucleotide
sequence of SEQ ID NO:41, but not to the nucleotide sequence of SEQ ID NO:42
when the hybridization is performed under identical conditions. In a particular
embodiment the nucleic acid comprises the nucleotide sequence of
CCCACGGTCGGGGTACCTGGGCGGGACGCGCCAGGCCGACTCCCGGCGA
(SEQ ID NO:29). The present invention further provides a nucleic acid that
hybridizes to the nucleotide sequence of SEQ ID NO:42, but not to the nucleotide
sequence of SEQ ID NO:41 when the hybridization is performed under identical
conditions. In a particular embodiment the nucleic acid comprises the nucleotide
sequence of CCCACGGTCGGGGTGGCCGACTCCCGGCGA (SEQ ID NO:37).
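
The relationship among the sequences recited above can be checked directly: removing the 19-nucleotide sequence of SEQ ID NO:40 from SEQ ID NO:29 yields SEQ ID NO:37. The following is a minimal sketch; the variable and function names are illustrative, and exact string matching is only a crude stand-in (an assumption of this sketch) for allele-specific hybridization, which in practice depends on the stringency of the conditions.

```python
# Sequences as recited in the text; variable and function names are illustrative.
SEQ_ID_29 = "CCCACGGTCGGGGTACCTGGGCGGGACGCGCCAGGCCGACTCCCGGCGA"
SEQ_ID_37 = "CCCACGGTCGGGGTGGCCGACTCCCGGCGA"
SEQ_ID_40 = "ACCTGGGCGGGACGCGCCA"


def delete_once(seq, segment):
    """Remove the first occurrence of segment from seq."""
    return seq.replace(segment, "", 1)


def classify(target):
    """Crude allele call by exact match (assumption: a real assay relies on
    hybridization stringency rather than string equality)."""
    has_full = SEQ_ID_29 in target      # spans the 19 nucleotides
    has_junction = SEQ_ID_37 in target  # spans the deletion junction
    if has_full and not has_junction:
        return "non-deleted"
    if has_junction and not has_full:
        return "deleted"
    return "undetermined"


print(len(SEQ_ID_40))
print(delete_once(SEQ_ID_29, SEQ_ID_40) == SEQ_ID_37)
print(classify(SEQ_ID_29), classify(SEQ_ID_37))
```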

In a related embodiment the present invention provides an isolated nucleic acid that
hybridizes to the complementary strand of the nucleotide sequence of SEQ ID NO:42,
but not to the complementary strand of the nucleotide sequence of SEQ ID NO:41
when the hybridization is performed under identical conditions. In still another
embodiment the nucleic acid hybridizes to the nucleotide sequence of SEQ ID
NO:41, but not to the nucleotide sequence of SEQ ID NO:42 when the hybridization
is performed under identical conditions. In still another embodiment the nucleic acid
hybridizes to the complementary strand of the nucleotide sequence of SEQ ID NO:41,
but not to the complementary strand of the nucleotide sequence of SEQ ID NO:42
when the hybridization is performed under identical conditions.

The present invention also provides a nucleic acid that hybridizes to the nucleotide
sequence of SEQ ID NO:42, but not to the nucleotide sequence of SEQ ID NO:45
when the hybridization is performed under identical conditions. In a related
embodiment the present invention provides an isolated nucleic acid that hybridizes to
the complementary strand of the nucleotide sequence of SEQ ID NO:42, but not to
the complementary strand of the nucleotide sequence of SEQ ID NO:45, when the
hybridization is performed under identical conditions. In still another embodiment
the nucleic acid hybridizes to the nucleotide sequence of SEQ ID NO:45, but not to
the nucleotide sequence of SEQ ID NO:42 when the hybridization is performed under
identical conditions. In still another embodiment the nucleic acid hybridizes to the
complementary strand of the nucleotide sequence of SEQ ID NO:45, but not to the
complementary strand of the nucleotide sequence of SEQ ID NO:42 when the
hybridization is performed under identical conditions.

The present invention also provides for the use of the nucleic acids of the present
invention (as well as other nucleic acids which can be used to identify DNA
polymorphisms in the alleles of the genes involved in folate, pyridoxine, and/or
cobalamin metabolism) in the methods of the present invention for identifying,
diagnosing, preventing and/or treating individuals.

In methods of estimating the susceptibility due to genetic or genetic and
environmental factors for an individual to have or to develop a developmental
disorder or to have offspring that develop a developmental disorder, and for the
corresponding methods of generating genetic, or genetic and environmental reference
datasets, the present invention provides a step of analyzing nucleic acids and/or
proteins from biological samples. In one particular embodiment, the assaying for the
presence of the genetic variant of human dihydrofolate reductase having a nucleotide
sequence with a 19 base-pair deletion spanning nucleotides 540 to 558 of the
nucleotide sequence of SEQ ID NO:41 is included as part of this analysis. This
genetic variant of human dihydrofolate reductase becomes a genetic explanatory
variable.



Determining if the biological sample contains the genetic variant of human
dihydrofolate reductase having a nucleotide sequence with a 19 base-pair deletion
spanning nucleotides 540 to 558 of the nucleotide sequence of SEQ ID NO:41 can be
performed by any appropriate method including PCR, special PCR, RT-PCR, RFLP
analysis, SSCP, and FISH.

In addition, all of the nucleic acids of the present invention including cDNA or
genomic DNA can be placed into expression vectors operably associated with an
expression control sequence. Alternatively, when the nucleic acid is part of an
expression control sequence, the nucleic acid and/or the expression control sequence
can be placed into an expression vector to control the expression of a coding
sequence, such as a reporter gene. Such expression vectors can then be placed into
either eukaryotic or prokaryotic host cells and expressed. The host cells comprising
the expression vectors are also part of the present invention. In addition, when the
nucleic acid includes a coding sequence or a part of a coding sequence, the present
invention includes methods of purifying the gene products from the coding sequence
or part thereof, and the purified gene products themselves.

Accordingly, it is a principal object of the present invention to provide a method for
identifying an individual that is genetically inclined to develop a developmental
disorder or disease.

It is a further object of the present invention to provide a method for identifying an 
individual that is genetically inclined to develop schizophrenia. 

It is a further object of the present invention to provide a method for identifying an
individual that is genetically inclined to have offspring having a developmental
disorder.

It is a further object of the present invention to provide a method of diagnosing 
schizophrenia. 

It is a further object of the present invention to provide a method of treating
developmental disorders such as schizophrenia.

It is a further object of the present invention to provide a method for monitoring the
treatment of the developmental disorder.

It is a further object of the present invention to provide a method for ameliorating the
effect of a defect in folate, pyridoxine or cobalamin metabolism on a fetus due to the
genetic or environmental status of a pregnant woman.

It is a further object of the present invention to provide a method of treating a patient 
who is genetically inclined to develop a developmental disorder such as 
schizophrenia. 

It is a further object of the present invention to provide a method of overcoming a
nutritional lack of folate, cobalamin or pyridoxine of a pregnant woman to prevent the
corresponding fetus from developing a developmental disorder.

Other objects and advantages will become apparent to those skilled in the art from a
review of the ensuing description.

These and other aspects of the present invention will be better appreciated by
reference to the following drawings and Detailed Description.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows primers for PCR amplification of the dihydrofolate reductase (DHFR)
deletion polymorphism region.

Figure 2 shows the genotypes of the DHFR 19 base-pair deletion by non-denaturing
polyacrylamide gel electrophoresis. Lanes 1 and 2 show genotypes 1,1. Lanes 3 and 4
show genotypes 1,2. Lanes 5 and 6 show genotypes 2,2. Lane 7 shows phiX174 RF
DNA/HaeIII size markers from BRL Life Technologies.

Figure 3 shows the sequences of PCR amplification products in the region of the
DHFR polymorphism. The asterisk (*) is explained in the text; see Example 2.

Figure 4A is a nucleotide sequence of the wild type human DHFR (SEQ ID NO:41)
from Yang et al., J. Mol. Biol. 176:169-187 (1984), GenBank accession no. X00855.
The start codon is in bold. Figure 4B is the same nucleotide sequence as that of
Figure 4A except for the deletion of the 19 nucleotides due to the DHFR deletion
polymorphism (SEQ ID NO:42).

DETAILED DESCRIPTION OF THE INVENTION 

The present invention in its broadest embodiment provides a method of diagnosing,
preventing and/or treating specific physiological/developmental disorders. Such
physiological/developmental disorders include schizophrenia, spina bifida cystica,
Tourette's syndrome, bipolar illness, autism, conduct disorders, attention deficit
disorder, obsessive compulsive disorder, chronic multiple tic syndrome and learning
disorders such as dyslexia.

A particular aspect of the present invention provides methodology for diagnosing,
preventing and/or treating a developmental disorder such as schizophrenia. Such
methodology is premised on the correlation between abnormalities in folate,
cobalamin, and/or pyridoxine metabolism in an individual and/or the mother of an
individual and the occurrence of the developmental disorder, e.g., schizophrenia, in
the individual. Further, the present invention provides a framework (i.e., the
gene-teratogen model and the DNA Polymorphism-Diet-Cofactor-Development model,
both of which are described in detail below) which fully explains the rationale for the
correlation, though the ultimate usefulness of the methods of the present invention is
independent of any particular model.

Within this context, the DNA Polymorphism-Diet-Cofactor-Development model
maintains that a developmental disorder such as schizophrenia results in part from
developmental brain damage sustained in utero due to maternal dietary deficiency of
folate, pyridoxine or cobalamin potentiated by the aggregate effect of minor defects
of folate, pyridoxine or cobalamin genes. The maternal damage to the fetus can result
in part from insufficiency of the folate, pyridoxine and cobalamin themselves and/or
from resulting effects such as immune deficiency and maternal teratogens, e.g.,
hyperhomocysteinemia. Genes from either parent acting in the fetus may modify
these damaging effects as exemplified in the gene-teratogen model, below.

As described herein the present invention can be practiced on a case by case basis, or
alternatively, it can be used in the screening of the general population, or within any
particular subgroup, such as newborns (as is presently performed in the diagnosis and
treatment of hyperphenylalaninemia).

Therefore, if appearing herein, the following terms shall have the definitions set out 
below. 

As used herein a "gene involved in folate, pyridoxine, or cobalamin metabolism" is a
gene that encodes a peptide or protein that plays a role in a pathway involved in either
folate, pyridoxine, or cobalamin metabolism. An incomplete listing of examples of
such proteins is given in Tables 2-7.

As used herein the term "individual" includes a fetus, infant, child, adolescent, and 
adult. Therefore, as used herein, an individual originates at conception. 

As used herein an individual with a susceptibility for "having offspring that develop a
developmental disorder" is meant to be indicative of the susceptibility of the
offspring of that individual to develop the developmental disorder and is not in any
way meant to be indicative of the susceptibility of the individual to have offspring.

The term "proband" as used herein is operationally defined by Table 8 along with the
accompanying explanatory information (see Example 1). For most purposes, the
proband can be considered the central figure in the familial analysis, the remaining
individuals in the family being designated as "blood relatives". There are three types
of probands: (1) an "affected proband", i.e., an individual that is believed to have a
developmental disorder; (2) a "control proband", i.e., an individual that is believed not to
have a developmental disorder; and (3) a "diagnostic proband", i.e., an individual
being diagnosed.

As used herein a "blood relative" of an individual is a relative that is related to the
individual in a genetic sense. Blood relatives can include mothers, fathers, children,
uncles, aunts, brothers, sisters, and grandparents. Preferably a blood relative is a
parent, a sibling, or a grandparent. Adopted relatives, step-parents, relatives through
marriage and the like are not blood relatives. Therefore, as used herein, the terms
"mother", "father", "sibling", "grandparent", "grandfather" and "grandmother" are
indicative of blood relationships.

As used herein a "mate of an individual" is a person whose genetic material is 
combined with that of the individual for the conception of the offspring in question. 

As used herein the term "schizophrenia" describes a disorder that is at least partially
due to one or more genetic mutations or polymorphisms in one or more genes
involved in folate, cobalamin or pyridoxine metabolism in an individual that is
schizophrenic and/or to one or more genetic mutations or polymorphisms in one or
more genes involved in folate, cobalamin or pyridoxine metabolism in the mother of
that individual.

As used herein an individual is "schizophrenic" when the individual displays
symptoms that would be accepted by an experienced psychiatrist to merit a diagnosis
of schizophrenia. Such a diagnosis is based, at least in part, on the currently evolving
guidelines for the diagnosis of schizophrenia which are listed in the successive
editions of Diagnostic and Statistical Manual for Mental Disorders, put out by the
American Psychiatric Association. The current edition is the DSM, Fourth Edition
(1994).

As used herein the terms "spina bifida cystica", "Tourette's syndrome", "bipolar
illness", "autism", "conduct disorder", "attention deficit disorder", "obsessive
compulsive disorder", "chronic multiple tic syndrome" and "learning disorders" such
as "dyslexia" describe disorders which display symptoms that would be accepted by
an experienced psychiatrist to merit a diagnosis of that disorder. Such a diagnosis is
based, at least in part, on the currently evolving guidelines which are listed in the
successive editions of Diagnostic and Statistical Manual for Mental Disorders, put out
by the American Psychiatric Association. The current edition is the DSM, Fourth
Edition (1994).

As used herein the term "teratogenic locus" indicates one or more alleles that act in a 
pregnant woman to cause an intrauterine teratogenic effect on the fetus. 

As used herein the terms "specificity locus" or "modifying locus" are used
interchangeably and are indicative of one or more alleles that can act during
pregnancy and/or after birth to prevent, modify, and/or ameliorate the teratogenic
effect of the teratogenic locus.

As used herein a "constellation of genetic mutations" is the set of genetic risk factor
mutations that is present in a proband and relatives of the proband. One example of a
constellation of genetic mutations is shown in a line of Table 8, below.

As used herein a "risk factor" is a teratogen or substance (including a defective gene)
that can lead to a teratogenic effect that is present or suspected of being present in a
tissue sample or body fluid of an individual's mother during the individual's gestation
and/or present or suspected of being present in a tissue sample or body fluid of the
individual.

As used herein a "genetic risk factor" is used interchangeably with the term "genetic
explanatory variable" and is a genetic mutation and/or polymorphism that causes or
potentially can cause the formation of and/or lead to the development of a risk factor
in an individual or the individual's mother during gestation.

As used herein an "environmental risk factor" is used interchangeably with the term
"environmental explanatory variable" and is an environmental factor that causes or
potentially can cause the formation of and/or lead to the development of a risk factor
in an individual or the individual's mother during gestation.

As used herein an "explanatory variable" is either an "environmental explanatory
variable" or a "genetic explanatory variable" or the variable defined by their
interaction or any combination of the above.

Enzymes whose deficiency may raise plasma homocysteine include
methylenetetrahydrofolate reductase (MTHFR), methionine synthase, and folate
receptors/transport proteins/binding proteins (as well as all of the proteins listed in
Tables 2-7 below).

The current (developmental) model for schizophrenia is that genetic and
environmental factors cause brain damage in a fetus that later develops schizophrenia.
However, the genetic and environmental factors have not been identified. Also,
extensive linkage and association studies have failed to identify genes determining
schizophrenia. The reasons usually given for this difficulty include: (i) locus
heterogeneity, i.e., more than one gene locus is involved, perhaps many gene loci
each with a small effect; (ii) the mode of inheritance of schizophrenia is unknown;
and (iii) an additional possible factor is that the frequency of the disease alleles may
be high, thus greatly reducing the power of linkage studies.

The DNA Polymorphism-Diet-Cofactor-Development model explains all of these
difficulties and at the same time proposes a unified metabolic abnormality. The
unified metabolic abnormality is: (a) ENVIRONMENTAL, i.e., due to a
folate/cobalamin/pyridoxine deficiency caused by either decreased ingestion or
increased requirement during pregnancy; (b) GENETIC, i.e., due to a
folate/cobalamin/pyridoxine genetic defect caused by the aggregate effect of multiple
mutations of folate/cobalamin/pyridoxine genes each individually having a small
effect; and (c) the interaction of the folate/cobalamin/pyridoxine environmental and
genetic factors (indicated above) to cause other harmful effects such as maternal
teratogens and immune deficiency during gestational development. Different gene
loci and different combinations of gene loci will be involved in different patients and
different families. The problem of locus heterogeneity is addressed by the hypothesis
that the folate/cobalamin/pyridoxine genetic defect is the aggregate effect of multiple
mutations of folate/cobalamin/pyridoxine genes each of which has a relatively small
effect.



The problem of mode of inheritance is addressed by the gene-teratogen model. The
gene-teratogen model describes the special features of genes acting in utero; both
teratogenic and modifying or specificity loci may be involved. If these effects are not
taken into account, the assignment of affection status in schizophrenia pedigrees is
inaccurate. Assignment of affection status is a key element in defining the mode of
inheritance for all kinds of linkage mapping. Failure to assign the correct mode of
inheritance is another factor that has made the linkage studies very difficult.

Finally, the DNA Polymorphism-Diet-Cofactor-Development model proposes that
some of the genetic factors for schizophrenia are common in the population. In fact,
subclinical deficiency of folate, pyridoxine, and cobalamin is common in the
population and common among pregnant women as well. Pregnancy further
increases the requirement for folate, pyridoxine, and cobalamin. Common genetic
polymorphisms of folate and cobalamin genes are also known, some of them
functional. Common genetic risk factors tend to be functional polymorphisms and/or
mutant alleles that individually have small effects. Otherwise, they would be largely
eliminated from the population by natural selection and would not be common. High
disease allele frequency is yet another factor that greatly diminishes the power of a
linkage study.

Besides explaining the difficulties with current linkage studies, the DNA
Polymorphism-Diet-Cofactor-Development model explains all of the unusual
biological and epidemiological features of schizophrenia: e.g., the decreased amount
of gray matter in brain areas, the unusual birth-month effect, the geographical
differences in incidence, the socioeconomic predilection, the association with
obstetrical abnormalities (low birth weight and prematurity), and the association with
famine and viral epidemics. Consistently, genetic linkage and cytogenetic studies in
schizophrenia have implicated various chromosome regions, some of them containing
folate, pyridoxine, and cobalamin genes including dihydrofolate reductase,
thymidylate synthase, and transcobalamin II. The DNA
Polymorphism-Diet-Cofactor-Development model predicts that folate, pyridoxine, or
cobalamin gene mutations have a high frequency in schizophrenia patients or family
members. Furthermore, mothers of schizophrenics are predicted to be particularly
susceptible to producing one or more teratogens during pregnancy.

The present invention therefore provides methods for: (a) Diagnostic testing of
schizophrenia by identifying a folate, pyridoxine, or cobalamin gene mutation or
constellation of mutations in the patient, mother, and father. (b) Prevention of
schizophrenia by diagnostic testing in families already affected by schizophrenia or
by diagnostic population screening for folate mutations and identifying couples at risk
for producing schizophrenic offspring. These pregnancies can be further monitored
for risk factors, e.g., dietary folate/pyridoxine/cobalamin, plasma
folate/pyridoxine/cobalamin, or red blood cell folate; plasma homocysteine or other
teratogens. (c) Therapy for schizophrenia, e.g., treating the pregnant mother with
folate, pyridoxine, cobalamin or other agents. The treatment can be monitored at
regular intervals to determine the effect of therapy. (d) Presymptomatic treatment of
schizophrenia: young children found to be susceptible to schizophrenia by
diagnostic testing for folate gene mutations and other risk factors can also be treated
with methylfolate or related therapeutic modalities to forestall the appearance of
schizophrenia symptoms in adolescence or adulthood.

Empirical studies with methylfolate treatment of schizophrenia have shown modest
clinical improvement. The DNA Polymorphism-Diet-Cofactor-Development model
gives a rationale for such therapy as well as for intensive testing of related therapeutic
modalities. Genetic testing will need to be carried out in such patients to gauge their
likelihood of responding to therapy. In addition, the DNA
Polymorphism-Diet-Cofactor-Development model gives direction and impetus
toward uncovering the mechanism of fetal brain damage leading to schizophrenia.

Diagnostic testing for schizophrenia can involve testing not just the patient, but
mother and father as well, for not just one factor but multiple genetic factors. For
example, data for two gene loci (both folate-related genes) were used in Example 2.
In this case, there were only four explanatory variables for each comparison.
In addition, risk factors appearing only during pregnancy may play a role, e.g., dietary
folate, which can be further monitored during the pregnancy. In certain instances,
genotype data can be used as the sole explanatory variables, particularly in the case
when no environmental explanatory variables are known. In such a case, the
predicted probabilities will be only for the genetic component of the proband's risk of
schizophrenia. In addition, schizophrenia mothers, fathers, and sibs do not
necessarily have to come from the same families as the schizophrenia probands, as
described in Example 2.

Of course certain genetic factors will turn out to be more common than others. This
may simplify testing somewhat. Also some genetic factors may operate chiefly in the
mother, while others will operate chiefly in the schizophrenic patient. This may also
simplify testing. There are some approaches to assessing risk factors during a past
pregnancy, e.g., current dietary history as an indicator of past diet, methionine loading
as an indicator of how susceptible a mother is to raising her plasma homocysteine, and
assessment of other risk factors besides folate metabolism that may affect pregnancy
outcome. Procedures including all of these variables are both envisioned and
included in the present invention.

Thus the present invention provides a method of diagnosis of schizophrenia. In one 
aspect of the invention, diagnostic testing for genetic susceptibility to schizophrenia 
determines the probability that the proband is affected with schizophrenia due to 
genetic factors. This is carried out by genetic testing of a patient suspected of having 
schizophrenia and/or whatever informative relatives are available, e.g. mother, father, 
sibs, or children. The genotypes of certain folate and/or cobalamin and/or pyridoxine 
gene mutations or constellation of mutations (folate and/or cobalamin and/or 
pyridoxine gene mutations) are determined for each individual. 

Since the abnormal phenotype of schizophrenia can be determined by both genetic
and environmental factors and since other genetic factors besides
folate/cobalamin/pyridoxine gene mutations may be involved, the presence of
folate/cobalamin/pyridoxine gene mutations may be neither necessary nor sufficient
to cause schizophrenia. Thus, an unaffected individual may have the same genetic
risk factors as an affected individual but may lack sufficient environmental factors to
cause the abnormal clinical disease. Also, an affected individual may lack
folate/cobalamin/pyridoxine gene mutations but may have other related or non-related
genetic risk factors that caused the schizophrenia.

Therefore folate/cobalamin/pyridoxine gene mutations are used as explanatory
variables (genetic risk factors) to calculate the predicted probability that an individual
has genetic susceptibility to schizophrenia due to these mutations. Genetic variation
can be expected to account for approximately half of the risk of developing
schizophrenia since the concordance rate in identical twins has been estimated to be
about 50%. The other half of the risk results from environmental factors, which for
identical twins arise from their different positions in the uterus and from differences
in the blood supply. The use of
environmental factors as additional explanatory variables enhances this probability
calculation, although this environmental data is more difficult to gather. Together,
using both genetic and environmental explanatory variables, the predicted probability
that an individual is schizophrenic may approach 1.0.

One likely situation for the use of the present methodology is in the diagnosis of a
patient that has developed a psychosis. In such a case, the clinician is likely to be
interested in determining the probability that this individual has schizophrenia. The
number of blood relatives (preferably first degree relatives) of the
patient-to-be-diagnosed, both unaffected and affected, could then be determined. The
number of these who would contribute a blood sample for analysis, for example, could then be
ascertained. It is preferable that the patient-to-be-diagnosed also contributes a blood
sample; however, in certain situations, this may not be an option. The availability of
dietary and epidemiological information for environmental explanatory variables,
especially from the patient and the mother, can also be ascertained. Of course all
relevant legal and ethical rules should be followed regarding informed consent for the
genetic testing.

Biological samples such as tissue or fluid samples (e.g., 7 ml of blood in an
EDTA-containing vacutainer, see Example 2, below), and obtainable environmental
data from the patient and family members are then collected. DNA is extracted from
the sample and genotypes for alleles of folate and/or cobalamin and/or pyridoxine
genes are determined. The methods for genotyping depend upon the specific genetic
markers used as explanatory variables. The methods for allele determination for two
genetic markers are discussed in the Examples below.

Data of the genetic and environmental explanatory variables for the
patient-to-be-diagnosed (proband) and participating family members are added to a
reference data set preferably consisting of well-defined schizophrenia probands and
family members, and control probands and family members, for whom data is
available for many explanatory variables. As an approximation the control probands
themselves also can be used as the controls for each proband family member class as
shown in Example 2, below. Thus, as an approximation the control probands can be
used as controls for the affected probands; and/or separately for the mothers of
affected probands; and/or separately for the fathers of affected probands, etc. Another
example of a use of the control probands is in the evaluation and/or analysis of a
particular diagnostic proband. In this case, the approximation is obtained by adding
the diagnostic proband to the group of affected probands and control probands.

A model is then created consisting of the explanatory variables actually available
from the specific patient-to-be-diagnosed and family members participating in the testing.
This new combined data set (reference data set and data from patient-to-be-diagnosed
with participating family members) is analyzed by binary logistic regression (e.g.,
using a statistical software package such as the SAS System embodied in Example 1
below, though other programs may be used) for the model chosen, giving the
predicted probability that a proband is affected with schizophrenia for all of the
probands including the patient-to-be-diagnosed.
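
The regression step can be illustrated with a minimal sketch. It is not the SAS procedure referred to above: it fits a binary logistic regression by plain gradient ascent using only the Python standard library. The reference data are invented purely for illustration (each row is a hypothetical count of variant alleles in the proband and in the proband's mother), and, as a simplification, the sketch fits the model on the reference data and then scores the patient-to-be-diagnosed rather than adding the patient to the data set first.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, steps=5000, rate=0.1):
    """Fit binary logistic regression by batch gradient ascent.
    rows: lists of explanatory variables; labels: 1 = affected, 0 = control.
    Returns the weights, with the intercept first."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            xs = [1.0] + list(x)
            err = y - sigmoid(sum(w * v for w, v in zip(weights, xs)))
            for j, v in enumerate(xs):
                grad[j] += err * v
        weights = [w + rate * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    """Predicted probability that a proband with variables x is affected."""
    xs = [1.0] + list(x)
    return sigmoid(sum(w * v for w, v in zip(weights, xs)))


# Hypothetical reference data: [proband variant alleles, mother variant alleles].
reference_rows = [[2, 1], [1, 2], [2, 2], [1, 1], [0, 1], [1, 0],   # affected
                  [0, 0], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]   # controls
reference_labels = [1] * 6 + [0] * 6

weights = fit_logistic(reference_rows, reference_labels)
patient = [2, 2]  # hypothetical genotypes for the patient-to-be-diagnosed
print(round(predict(weights, patient), 3))
```

In practice a statistical package would also report standard errors and goodness-of-fit statistics, which this sketch omits.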

In a particular embodiment the model is modified and the goodness of fit for the
patient-to-be-diagnosed is checked. The predicted probability that the
patient-to-be-diagnosed has schizophrenia is compared with a classification table
generated from the model used to determine the likelihood of false positives and false
negatives.
The predicted probability that the patient-to-be-diagnosed is affected with
schizophrenia, with the likelihood of a false positive or false negative result, can then
be forwarded to the clinician.
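
A classification table of the kind mentioned above can be sketched as follows. The predicted probabilities and affection statuses are invented for illustration, the 0.5 cutoff is an assumption, and the rate definitions used here (the fraction of controls called affected and the fraction of affected probands called unaffected) are one common convention; statistical packages may define these rates differently.

```python
def classification_table(probabilities, labels, cutoff=0.5):
    """Count true/false positives and negatives at a probability cutoff.
    labels: 1 = affected proband, 0 = control proband."""
    table = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, y in zip(probabilities, labels):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1:
            table["TP" if y == 1 else "FP"] += 1
        else:
            table["FN" if y == 1 else "TN"] += 1
    return table


def error_rates(table):
    """Return (false positive rate, false negative rate)."""
    controls = table["FP"] + table["TN"]
    affected = table["FN"] + table["TP"]
    fp_rate = table["FP"] / controls if controls else 0.0
    fn_rate = table["FN"] / affected if affected else 0.0
    return fp_rate, fn_rate


# Hypothetical predicted probabilities and known affection status.
predicted = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
status = [1, 1, 1, 0, 0, 0]

table = classification_table(predicted, status)
print(table)
print(error_rates(table))
```

Raising the cutoff trades false positives for false negatives, so the cutoff reported to the clinician would depend on which error is considered more costly.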

The methods for determining an individual's risk for developing schizophrenia taught
by the present invention can be used in a variety of settings. For example, the present
invention also provides a therapy for schizophrenia. Empirical studies with
methylfolate treatment of schizophrenia have shown modest clinical improvement.
The DNA Polymorphism-Diet-Cofactor-Development model provides a rationale for
such therapy as well as for intensive testing of related therapeutic modalities, e.g.,
other cofactors such as cobalamin or pyridoxine. In addition, the DNA
Polymorphism-Diet-Cofactor-Development model gives direction and impetus
toward uncovering the mechanism of fetal brain damage leading to schizophrenia. Of
course such therapy also can be provided on a case by case basis in order to gauge the
likelihood of the patient responding to such therapy, with the methodology for
diagnosis of the present invention enabling the skilled practitioner to assess that
likelihood.

In addition, the present invention provides a method of identifying individuals that 
are likely to be aided by presymptomatic treatment for schizophrenia. For example, 
young children found to have a high risk for susceptibility to schizophrenia by 
diagnostic testing can be treated with methylfolate or related therapeutic modalities to 
forestall the appearance of schizophrenia symptoms in adolescence or adulthood. 
The present invention further provides methodology for diagnostic testing for specific 
families already affected by schizophrenia.

The present invention further provides methodology for population screening for
folate/cobalamin/pyridoxine mutations to help identify couples at risk for producing 
schizophrenic offspring. Subsequent or concurrent pregnancies can then be 
monitored for environmental risk factors, and treated with folate, cobalamin, 
pyridoxine or other agents and monitored at intervals for the effect of therapy. Such 
monitoring can include measuring levels of folate, cobalamin, pyridoxine or
homocysteine in a particular tissue and/or fluid sample, such as blood.

Since schizophrenia is a developmental disorder, it is likely that these same risk
factors discussed here for schizophrenia could play a role in other developmental
disorders including spina bifida cystica, Tourette's syndrome, learning disorders
including dyslexia, conduct disorder, attention-deficit hyperactivity disorder, bipolar
illness, autism, and obsessive-compulsive disorder. Interestingly, the mode of
inheritance of these disorders, like that of schizophrenia, has been difficult to
determine despite the fact that a genetic component to the etiology of each has been
documented. Therefore, methodology analogous to that exemplified herein for
schizophrenia can be readily adapted for diagnosing and/or treating other such
developmental disorders.

Nucleic Acids 

In accordance with the present invention there may be employed conventional
molecular biology, microbiology, and recombinant DNA techniques within the skill
of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook,
Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition
(1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (herein
"Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II
[D.N. Glover ed. (1985)]; Oligonucleotide Synthesis [M.J. Gait ed. (1984)]; Nucleic Acid
Hybridization [B.D. Hames & S.J. Higgins eds. (1985)]; Transcription And
Translation [B.D. Hames & S.J. Higgins, eds. (1984)]; Animal Cell Culture [R.I.
Freshney, ed. (1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)];
B. Perbal, A Practical Guide To Molecular Cloning (1984); F.M. Ausubel et al.
(eds.), Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).

A "nucleic acid molecule" refers to the phosphate ester polymeric form of
ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or
deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or
deoxycytidine; "DNA molecules"), or any phosphoester analogs thereof, such as
phosphorothioates and thioesters, in either single stranded form, or a double-stranded
helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible.
The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only
to the primary and secondary structure of the molecule, and does not limit it to any
particular tertiary forms. Thus, this term includes double-stranded DNA found, inter
alia, in linear or circular DNA molecules including restriction fragments, plasmids,
and chromosomes. In discussing the structure of particular double-stranded DNA
molecules, sequences may be described herein according to the normal convention of
giving only the sequence in the 5' to 3' direction along the nontranscribed strand of
DNA (i.e., the strand having a sequence homologous to the mRNA). A "recombinant
DNA molecule" is a DNA molecule that has undergone a molecular biological
manipulation.

A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a 
cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid 
molecule can anneal to the other nucleic acid molecule under the appropriate
conditions of temperature and solution ionic strength (see Sambrook et al., supra).
The conditions of temperature and ionic strength determine the "stringency" of the
hybridization. High stringency hybridization conditions correspond to 50%
formamide, 5x or 6x SSC. Hybridization requires that the two nucleic acids contain
complementary sequences, although depending on the stringency of the hybridization,
mismatches between bases are possible. The appropriate stringency for hybridizing
nucleic acids depends on the length of the nucleic acids, the GC percentage, and the
degree of complementation, variables well known in the art. The greater the degree
of similarity or homology between two nucleotide sequences, the greater the value of
the melting temperature (Tm) for hybrids of nucleic acids having those sequences.
The relative stability (corresponding to higher Tm) of nucleic acid hybridizations
decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of
greater than 100 nucleotides in length, equations for calculating Tm have been derived
(see Sambrook et al., supra, 9.50-10.51). For hybridization with shorter nucleic acids,
i.e., oligonucleotides, the position of mismatches becomes more important, and the
length of the oligonucleotide determines its specificity (see Sambrook et al., supra,
11.7-11.8). Preferably a minimum length for a hybridizable nucleic acid (e.g., a nucleotide
probe or primer such as a PCR or RT-PCR primer) is at least about 12 nucleotides;
preferably at least about 18 nucleotides; and more preferably the length is at least
about 27 nucleotides; and most preferably at least about 36 nucleotides. Specific
probes and primers that can be used to distinguish specific variants of the nucleic
acids encoding the proteins involved in folate, pyridoxine, and/or cobalamin
metabolism are also part of the present invention.
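
The dependence of Tm on length, GC percentage, and hybridization conditions can be illustrated with a short Python sketch. It uses two widely cited empirical approximations: a salt- and formamide-adjusted formula for long DNA:DNA hybrids, and the "2 + 4" rule of thumb for short oligonucleotides. These are assumed here for illustration and are not necessarily the exact equations of the cited reference; the function names and example sequences are hypothetical.

```python
import math


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def tm_long_hybrid(seq, na_molar=1.0, formamide_pct=0.0):
    """Approximate Tm (deg C) for DNA:DNA hybrids longer than about 100 nt.

    Assumed empirical formula:
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.63*(%formamide) - 600/length
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent(seq)
            - 0.63 * formamide_pct
            - 600.0 / len(seq))


def tm_short_oligo(seq):
    """Rule-of-thumb Tm (deg C) for short oligonucleotides: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 200-nt hybrid with 50% GC, 1 M Na+, with and without 50% formamide.
hybrid = "GC" * 50 + "AT" * 50
print(tm_long_hybrid(hybrid), tm_long_hybrid(hybrid, formamide_pct=50.0))
print(tm_short_oligo("ATGCATGCATGCATGCAT"))  # hypothetical 18-mer probe
```

The sketch makes the qualitative points of the paragraph concrete: higher GC content and greater length raise the estimated Tm, while formamide lowers it, so the same probe is held to a more stringent test in 50% formamide at a given temperature.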

Such nucleotide probes and primers can be labeled or used to label complementary
DNA (where appropriate) by any number of ways well known in the art including
using a radioactive label, such as ³H, ¹⁴C, ³²P, or ³⁵S, a fluorescent label, a boron label
[U.S. Patent No: 5,595,878, Issued January 21, 1997 and U.S. Patent No: 5,876,938,
Issued March 2, 1999 which are incorporated by reference in their entireties], and
enzymatic tags such as urease, alkaline phosphatase or peroxidase. In the case of
enzyme tags, colorimetric indicator substrates are known which can be employed to
provide a means, visible to the human eye or detectable spectrophotometrically, to
identify specific hybridization with complementary nucleic acid-containing samples.

In a specific embodiment, the term "standard hybridization conditions" refers to a Tm
of 55°C, and utilizes conditions as set forth above, e.g., 5X SSC. In a preferred
embodiment, the Tm is 60°C; in a more preferred embodiment, the Tm is 65°C.

A DNA "coding sequence" is a double-stranded DNA sequence which is transcribed
and translated into a polypeptide in a cell in vitro or in vivo when placed under the
control of appropriate regulatory sequences. The boundaries of the coding sequence
are determined by a start codon at the 5' (amino) terminus and a translation stop
codon at the 3' (carboxyl) terminus. A coding sequence can include, but is not
limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA
sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA
sequences. If the coding sequence is intended for expression in a eukaryotic cell, a
polyadenylation signal and transcription termination sequence will usually be located
3' to the coding sequence.

"Transcriptional and translational control sequences" are DNA regulatory sequences, 
such as promoters, enhancers, terminators, and the like, that provide for the 
expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation
signals are control sequences. 

A "promoter sequence" is a DNA regulatory region capable of binding RNA 
polymerase and initiating transcription of a downstream (3' direction) coding
sequence. For purposes of defining the present invention, the promoter sequence is
bounded at its 3' terminus by the transcription initiation site and extends upstream (5'
direction) to include the minimum number of bases or elements necessary to initiate
transcription at levels detectable above background. Within the promoter sequence
will be found a transcription initiation site (conveniently defined, for example, by
mapping with nuclease S1), as well as protein binding domains (consensus
sequences) responsible for the binding of RNA polymerase.

A "signal sequence" is included at the beginning of the coding sequence of a protein 
to direct the protein to a particular site/compartment in the cell such as the surface of 
a cell. This sequence encodes a signal peptide, N-terminal to the mature polypeptide, 
that directs the host cell to translocate the polypeptide. The term "translocation signal 
sequence" is used herein to refer to this sort of signal sequence. Translocation signal
sequences can be found associated with a variety of proteins native to eukaryotes and
prokaryotes, and are often functional in both types of organisms.

Identification of Genetic Mutations 

A biological sample can be obtained from an individual and/or a blood relative of the
individual, and from appropriate controls, using a sample from any body component
including tissue punches, body fluids, and hair, as long as the biological sample
contains nucleic acids and/or proteins/peptides. Thus the DNA, mRNA, proteins or
peptides of the biological sample can be used to identify mutations and/or variants in
genes involved in folate, pyridoxine, or cobalamine metabolism. The present
invention therefore includes methods of detecting and quantifying these nucleic acids
and/or proteins/peptides that can be used to identify genetic risk factors.

In a particular embodiment the DNA is extractable. A particularly useful source of 
DNA is blood. For example, 2.5-40 ml of blood can be collected in a vacutainer
containing EDTA. The blood sample is placed on ice and then centrifuged to separate
plasma, red cells, and buffy coat. The separated fractions are then frozen at -80°C.

The DNA can be isolated from the buffy coat by a number of procedures well known 
in the art including using a QIAmp column DNA extraction procedure or the 
QIAGEN Genomic-tip method. The isolated DNA can be digested with a series of 
restriction enzymes, for example, and then the digested products can be hybridized 
with one or more particular nucleic acid probes designed from a particular gene to
identify the gene and preferably to test for particular genetic mutations. 

Preferably the genomic DNA can be amplified by PCR using appropriate primer pairs
such as the primer pairs for the MTHFR or DHFR genes which were used in the
Example below. The PCR amplified product can be sequenced directly, or
alternatively be digested with one or more appropriate restriction enzymes. The
resulting digested products can be separated, e.g., by column chromatography, or
preferably by polyacrylamide or agarose gel electrophoresis. The isolated digestion
products can be compared, e.g., by previously determined restriction maps, and/or
alternatively, the digestion products can be sequenced directly. Alternatively, as in
the case of DHFR, genetic polymorphisms can be detected through the use of
restriction enzymes.

Although a restriction map of a gene is sufficient for the employment of the methods
disclosed herein, in preferred embodiments the nucleotide sequences of the genes
used in the testing steps are known. To this end a large sampling of such sequences
is provided in Tables 2-7. (These sequences may also be used in the design of
restriction maps.) Thus, initially each gene, whether used separately or used in a
constellation of genes, is characterized by the sequencing of the wild type gene,
preferably including the coding regions, introns, control sequences, and other
non-coding regions. In addition, mutations of such genes found in the general population
can also be characterized. With the recent advances in the sequencing of the human
genome the present invention contemplates that additional sequence information will
become publicly available, particularly with regard to mutations in relevant introns,
and control sequences etc. which are not available in cDNA libraries. Such sequence
information is fully envisioned to be incorporated into the on-going compilations of
relevant DNA sequence databases of the present invention, as well as for its parallel
use in the general methodology described herein. Thus DNA or mRNA or cDNA
made from the mRNA can be used to identify mutations and/or variants in genes
involved in folate, pyridoxine, or cobalamine metabolism.

There are many methods currently known in the art to identify variant/mutant DNA,
all of which may be used in the present invention (see e.g., internet address
http://www.ich.bpmf.ac.uk/Gmgs/mutdet.htm). Such methods include but in no way are
limited to direct sequencing, array sequencing, matrix-assisted laser
desorption/ionization time-of-flight mass spectrometry (MALDI-TOF) [Fitzgerald et al.,
Ann. Rev. Biophy. Biomol. Struct. 24:117-140 (1995)], Polymerase Chain Reaction
"PCR", reverse-transcriptase Polymerase Chain Reaction "RT-PCR", RNase
protection assays, array quantitation, e.g., as commercially provided by Affymetrix,
Ligase Chain Reaction or Ligase Amplification Reaction (LCR or LAR),
Self-Sustained Sequence Replication (3SR/NASBA), Restriction Fragment Length
Polymorphism (RFLP), Cycling Probe Reaction (CPR), Single-Strand Conformation
Polymorphism (SSCP), heteroduplex analysis, hybridization mismatch using
nucleases (e.g., cleavase), Southerns, Northerns, Westerns, Southwesterns, ASOs,
molecular beacons, footprinting, and Fluorescent In Situ Hybridization (FISH). Some
of these methods are briefly described below.

PCR is a method for increasing the concentration of a segment of target sequence in a
mixture of genomic DNA without cloning or purification. PCR can be used to
directly increase the concentration of the target to an easily detectable level. This
process for amplifying the target sequence involves introducing a molar excess of two
oligonucleotide primers which are complementary to their respective strands of the
double-stranded target sequence to the DNA mixture containing the desired target
sequence. The mixture is denatured and then allowed to hybridize. Following
hybridization, the primers are extended with polymerase so as to form complementary
strands. The steps of denaturation, hybridization, and polymerase extension can be
repeated in order to obtain relatively high concentrations of a segment of the desired
target sequence. The length of the segment of the desired target sequence is
determined by the relative positions of the primers with respect to each other, and,
therefore, this length is a controllable parameter. Because the desired segments of the
target sequence become the dominant sequences (in terms of concentration) in the
mixture, they are said to be "PCR-amplified." [Mullis (U.S. Patent No. 4,683,195)
and Mullis et al. (U.S. Patent No. 4,683,202)]
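
Two points of the PCR description lend themselves to a small Python sketch: the amplified segment is delimited by the primer positions, and repeated cycles multiply the target. The sketch assumes exact primer matches on a single linear template and an idealized per-cycle efficiency; the template, primers, and function names are hypothetical and are not the MTHFR or DHFR primers of the Example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template, forward_primer, reverse_primer):
    """Return the segment bounded by the two primers, or None if either is absent.

    The forward primer matches the given (top) strand; the reverse primer
    anneals to it, so its reverse complement is searched for downstream.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    rc = reverse_complement(reverse_primer)
    end = template.find(rc, start)
    if end == -1:
        return None
    return template[start:end + len(rc)]


def copies_after(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number: each cycle multiplies the target by (1 + efficiency)."""
    return initial_copies * (1.0 + efficiency) ** cycles


# Hypothetical template and primers, for illustration only.
template = "AAAACCGGTTAGCTAGCTAGGATCCAAAA"
product = amplicon(template, "CCGGTT", "ATCCTA")
print(product, len(product))
print(copies_after(10, 30))
```

Moving either primer along the template changes the product length directly, which is the sense in which the amplified length is a controllable parameter. Real reactions fall short of doubling per cycle, so the efficiency argument would be below 1.0 in practice.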

In Ligase Chain Reaction or Ligase Amplification Reaction (LCR or LAR) four
oligonucleotides, two adjacent oligonucleotides which uniquely hybridize to one
strand of target DNA, and a complementary set of adjacent oligonucleotides, which
hybridize to the opposite strand, are mixed and DNA ligase is added to the mixture.
Provided that there is complete complementarity at the junction, ligase will covalently
link each set of hybridized molecules. Importantly, in LCR, two probes are ligated
together only when they base-pair with sequences in the target sample, without gaps
or mismatches. Repeated cycles of denaturation, hybridization and ligation amplify a
short segment of DNA. [Barany, Proc. Natl. Acad. Sci., 88:189 (1991); Barany, PCR
Methods and Applic., 1:5 (1991); and Wu and Wallace, Genomics 4:560 (1989)]
LCR has also been used in combination with PCR to achieve enhanced detection of
single-base changes. Segev, PCT Public. No. WO9001069 A1 (1990).

Self-Sustained Sequence Replication (3SR/NASBA) is a transcription-based in vitro
amplification system [Guatelli et al., Proc. Natl. Acad. Sci., 87:1874-1878, 7797
(1990); Kwok et al., Proc. Natl. Acad. Sci., 86:1173-1177] that can exponentially
amplify RNA sequences at a uniform temperature. The amplified RNA can then be
utilized for mutation detection [Fahy et al., PCR Meth. Appl., 1:25-33 (1991)]. In this
method, an oligonucleotide primer is used to add a phage RNA polymerase promoter
to the 5' end of the sequence of interest. In a cocktail of enzymes and substrates that
includes a second primer, reverse transcriptase, RNase H, RNA polymerase and
ribo- and deoxyribonucleoside triphosphates, the target sequence undergoes repeated
rounds of transcription, cDNA synthesis and second-strand synthesis to amplify the
area of interest.

RFLP can be used to detect DNA polymorphisms arising from DNA sequence
variation. This method consists of digesting DNA with one or more restriction
endonucleases (e.g., EcoRI) and analyzing the resulting fragments by means of
Southern blots [Southern, E., Methods in Enzymology, 69:152 (1980)], as further
described by Botstein, et al., Am. J. Hum. Genet., 32:314-331 (1980) and White, et
al., Sci. Am., 258:40-48 (1988). Since a DNA polymorphism may create or delete a
restriction site, the length of the corresponding restriction fragment with any given
restriction enzyme could change. Once a difference in a restriction fragment length is
identified it can be used to readily distinguish a particular polymorphism from the
wild type DNA. Mutations that affect the recognition sequence of the endonuclease
will preclude enzymatic cleavage at that site, thereby altering the cleavage pattern of
that DNA. DNAs are compared by looking for differences in restriction fragment
lengths. A technique for detecting specific mutations in any segment of DNA is
described in Wallace, et al. [Nucl. Acids Res., 9:879-894 (1981)]. It involves
hybridizing the DNA to be analyzed (target DNA) with a complementary, labeled
oligonucleotide probe. Due to the thermal instability of DNA duplexes containing
even a single base pair mismatch, differential melting temperature can be used to
distinguish target DNAs that are perfectly complementary to the probe from target
DNAs that differ by as little as a single nucleotide. In a related technique, described in
Landegren, et al., Science, 241:1077-1080 (1988), oligonucleotide probes are
constructed in pairs such that their junction corresponds to the site on the DNA being
analyzed for mutation. These oligonucleotides are then hybridized to the DNA being
analyzed. Base pair mismatch between either oligonucleotide and the target DNA at
the junction location prevents the efficient joining of the two oligonucleotide probes
by DNA ligase.
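
The RFLP principle described above, in which a mutation that destroys a recognition site changes the fragment lengths, can be sketched in Python. The sketch assumes a linear molecule, considers only the strand as written, and uses the EcoRI recognition sequence GAATTC with cleavage after the first G; the sequences and the function name are hypothetical.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the cut position within the recognition sequence
    (1 for EcoRI, which cleaves G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - start)  # remainder after the last cut
    return fragments


# Hypothetical wild-type sequence with two EcoRI sites, and a variant in which
# a single base change (GAATTC -> GAGTTC) abolishes the second site.
wild_type = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTT"
variant = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTT"
print(digest(wild_type))  # three fragments
print(digest(variant))    # two fragments: the last two are merged
```

On a gel or Southern blot the variant would show one longer fragment in place of two shorter ones, which is the length difference used to distinguish the polymorphism from wild type DNA.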

When a sufficient amount of a nucleic acid to be detected is available, there are
advantages to detecting that sequence directly, instead of making more copies of that
target (e.g., as in PCR and LCR). Most notably, a method that does not amplify the
signal exponentially is more amenable to quantitative analysis. Even if the signal is
enhanced by attaching multiple dyes to a single oligonucleotide, the correlation
between the final signal intensity and amount of target is direct. Such a system has an
additional advantage that the products of the reaction will not themselves promote
further reaction, so contamination of lab surfaces by the products is not as much of a
concern. Traditional methods of direct detection including Northern and Southern
blotting and RNase protection assays usually require the use of radioactivity and are
not amenable to automation. Recently devised techniques have sought to eliminate
the use of radioactivity and/or improve the sensitivity in automatable formats.

One such example is the Cycling Probe Reaction (CPR) [Duck et al., BioTech., 9:142
(1990)]. CPR uses a long chimeric oligonucleotide in which a central portion is
made of RNA while the two termini are made of DNA. Hybridization of the probe to
a target DNA and exposure to a thermostable RNase H causes the RNA portion to be
digested. This destabilizes the remaining DNA portions of the duplex, releasing the
remainder of the probe from the target DNA and allowing another probe molecule to
repeat the process. The signal, in the form of cleaved probe molecules, accumulates
at a linear rate. While the repeating process increases the signal, the RNA portion of
the oligonucleotide is vulnerable to RNases that may be carried through sample
preparation.

Single-Strand Conformation Polymorphism (SSCP) is based on the observation that
single strands of nucleic acid can take on characteristic conformations in
non-denaturing conditions, and these conformations influence electrophoretic
mobility [Hayashi, PCR Meth. Appl., 1:34-38 (1991)]. The complementary strands
assume sufficiently different structures that one strand may be resolved from the
other. Changes in sequences within the fragment will also change the conformation,
consequently altering the mobility and allowing this to be used as an assay for
sequence variations [Orita, et al., Genomics 5:874-879 (1989)]. The SSCP process
involves denaturing a DNA segment (e.g., a PCR product) that is labeled on both
strands, followed by slow electrophoretic separation on a non-denaturing
polyacrylamide gel, so that intra-molecular interactions can form and not be disturbed
during the run. This technique is extremely sensitive to variations in gel composition
and temperature.

In Fluorescent In Situ Hybridization (FISH), specific probes are designed which can
readily distinguish the wild-type gene from the variant/mutant gene. Such
methodology allows the identification of a variant/mutant gene through in situ
hybridization (U.S. Patent No. 5,028,525, Issued July 2, 1991; U.S. Patent No.
5,225,326, Issued July 6, 1993; and U.S. Patent No. 5,501,952, Issued March 26,
1996). FISH does not require the extraction of DNA. In addition, procedures for
separating fetal blood cells from maternal blood cells are well known in the art
allowing the fetus and the mother to be analyzed from the same body fluid sample
(see U.S. Patent No: 5,629,147, Issued May 13, 1997).

Similarly, antibodies raised against specific mutations and/or variants in the gene
products of the genes involved in folate, pyridoxine, or cobalamine metabolism can
be used to identify specific polymorphisms. Alternatively, antibodies raised against
the wild type proteins can be used to detect and/or quantify the amount of wild type
protein present in a given biological sample. In the case in which cross-reacting
protein is not synthesized by the cells of an individual, or is synthesized in
significantly lower amounts than those of control subjects, such determinations can be
used to identify a genetic risk factor. In addition, these antibodies can be used in
methods well known in the art relating to the localization and activity of the gene
products, e.g., for Western blotting, imaging the proteins in situ, measuring levels
thereof in appropriate physiological samples, etc. using any of the detection
techniques known in the art. Furthermore, such antibodies can be used in flow
cytometry studies, in immunohistochemical staining, and in immunoprecipitation
which serves to aid the determination of the level of expression of a protein in the cell
or tissue.

In the particular instance when the gene product is an enzyme, e.g., dihydrofolate
reductase, the enzymatic activity of a biological sample can be indicative of the
presence of a genetic risk factor. In a particular embodiment, a decrease in an enzyme
activity that is associated with folate, pyridoxine, or cobalamine metabolism can be
indicative of the presence of the genetic risk factor. Such assays can be performed on
multiple samples such as on a microplate reader [Widemann et al., Clin Chem.
45:223-228 (1999)].

MODEL 1 

The Gene-Teratogen Model for the Inheritance Pattern of Certain Developmental
Disorders

Introduction:

It has long been known, e.g. from extensive studies of exogenous teratogens in inbred
mice [Finnell and Chernoff, Gene-teratogen interactions: an approach to
understanding the metabolic basis of birth defects. In Pharmacokinetics in
Teratogenesis, Vol. 11:97-109 Experimental Aspects In Vivo and In Vitro, CRC Press,
Inc, Boca Raton, FL (1987)], that the effects of teratogens may be influenced by genetic
factors. It is less well known that the same gene defect may cause different clinical
disorders depending upon whether the metabolic effect of the gene defect is exerted during
gestation in utero or during postnatal life. However, the consequences of
gene-teratogen interactions in human pedigrees have not been extensively explored,
especially the consequences for the use of linkage mapping to identify an unknown
gene acting in utero to cause a developmental disorder. A number of common human
developmental disorders have been shown to have a genetic component to their
etiology. However, for certain developmental disorders, the mode of inheritance has
been difficult to determine and linkage studies have met with unexpected difficulties
or have achieved limited success. These developmental disorders include spina bifida
cystica [Chatkupt, Am J Med Genet, 44:508-512 (1992)], Tourette's syndrome and
related disorders, e.g. obsessive-compulsive disorder and chronic multiple tics
syndrome [Pauls, Adv Neurol, 58:151-157 (1992); McMahon et al., Adv Neurol,
58:159-165 (1992); Heutink et al., Am J Hum Genet, 57:465-473 (1995); Grice et al.,
Am J Hum Genet, 59:644-652 (1996)], learning disorders, including dyslexia [Lewis,
et al., Behav Genet, 23:291-297 (1993); Pennington, J Child Neurol, 10 Suppl 1:S69-S77
(1995)], conduct disorder [Lombroso et al., J Am Acad Child Adolesc Psychiatry,
33:921-938 (1994)], attention-deficit hyperactivity disorder [Lombroso et al., J Am
Acad Child Adolesc Psychiatry, 33:921-938 (1994)], bipolar illness [Baron, Acta
Psychiatr Scand, 92:81-86 (1995); Benjamin and Gershon, Biol Psychiatry, 40:313-316
(1996); Risch and Botstein, Nature Genet, 12:351-353 (1996); Jamison and
McInnis, Nature Med, 2:521-522 (1996); Morell, Science, 272:31-32 (1996)],
schizophrenia [Owen, Psychol Med, 22:289-293 (1992); Cloninger, Am J Med Genet,
54:83-92 (1994); Lander and Kruglyak, Nature Genet, 11:241-247 (1995); Baron,
Acta Psychiatr Scand, 92:81-86 (1995); Benjamin and Gershon, Biol Psychiatry,
40:313-316 (1996); Baron, Am J Med Genet, 67:121-123 (1996)], autism [Lombroso
et al., J Am Acad Child Adolesc Psychiatry, 33:921-938 (1994)], and
obsessive-compulsive disorder in adults [Lombroso et al., J Am Acad Child Adolesc
Psychiatry, 33:921-938 (1994)]. A recent article [Moldin, Nature Genet 17:127-129
(1997)] has reviewed "The maddening hunt for madness genes."

The present model addresses the question of the mode of inheritance of certain
developmental disorders and proposes the "gene-teratogen model." The model
suggests that the mode of inheritance of genes acting prenatally may in some cases be
fundamentally different from that of genes acting postnatally. Even the same gene
acting prenatally may produce a different disorder from that gene acting postnatally.
The inheritance pattern in the gene-teratogen model is simple, but from the
perspective of the patient with the developmental disorder is neither dominant nor
recessive. Some disorders regarded as multifactorial, polygenic, or oligogenic may
have this mode of inheritance. In the gene-teratogen model, genetically determined
teratogen production by the mother during pregnancy damages the fetus producing
the abnormal phenotype of a developmental disorder. The model is illustrated with
two types of loci: 1. a teratogenic locus acting in the mother, and 2. a modifying or
specificity locus acting in the fetus. Damage by the teratogen is influenced also by
environmental factors. The model is interesting because it is simple and because
teratogenic loci will be difficult to locate by parametric or non-parametric linkage
mapping techniques due to misspecification of the affection status of both mother and
affected children. A study design is suggested for identifying teratogenic loci. An
example of the gene-teratogen model is the major intrauterine effect seen in offspring
of phenylketonuric mothers. Certain developmental disorders whose mode of
inheritance has been difficult to determine or whose genetic factors have been
difficult to locate are candidates for the gene-teratogen model, including spina bifida
cystica, Tourette's syndrome, learning disorders including dyslexia, conduct disorder,
attention-deficit hyperactivity disorder, bipolar illness, schizophrenia, autism, and
obsessive-compulsive disorder.

The Gene-Teratogen Model

The model is described in Table 1 using two kinds of loci: a "teratogenic" locus and a
"modifying" or "specificity" locus. The gene-teratogen model requires a teratogenic
locus. One or more modifying or specificity loci may or may not be present. Also,
two types of phenotypes are defined: 1. the teratogen-induced phenotype; and 2. the
teratogenic phenotype, i.e., the phenotype of a mother that produces a teratogenic
effect during pregnancy. The two phenotypes are different for the teratogenic locus
but are identical for the modifying or specificity loci.

TABLE 1

DIAGRAM OF THE GENE-TERATOGEN MODEL

Grandparents:   Maternal       Maternal       Paternal       Paternal
                Grandmother    Grandfather    Grandmother    Grandfather
                AabbCCdd       AaBbCcdd       AAbbCcDd       AAbbCCdd

Parents:        Mother                        Father
                aaBbCcdd                      AAbbCcDd

Child:          Child (fetus) with developmental disorder
                AabbccDd

locus A:        teratogenic locus, recessive, acting in the mother to cause
                intrauterine teratogenic damage to the fetus.

locus B:        teratogenic locus, dominant, acting in the mother to cause
                intrauterine teratogenic damage to the fetus.

locus C:        modifying or specificity locus, recessive, acting in the fetus.

locus D:        modifying or specificity locus, dominant, acting in the fetus.

The teratogenic locus may be recessive (locus A) or dominant (locus B). This locus
acts in the mother during pregnancy to cause an intrauterine teratogenic effect in the
fetus. The teratogenic effect may result from the production of an endogenous
teratogen, from potentiation of an exogenous teratogen, from a metabolic deprivation
or imbalance or from some other mechanism. Only one teratogenic locus is required;
both locus A and locus B are shown on the same diagram for simplicity. A specificity
or modifying locus may be recessive (locus C) or dominant (locus D). Such a locus
acts during pregnancy or after to modify the extent of the developmental damage
done by the teratogenic locus or even to prevent or repair the damage. For example,
for a teratogen acting at a certain time in development, locus C or D may determine
whether brain or kidney is damaged, which structures of the brain are damaged, or
whether damage occurs at all.

1. Locus A, recessive teratogenic locus, acting in the mother: The child is the patient
with the abnormal phenotype of a specific developmental disorder, while mother,
father, and grandparents do not have the abnormal phenotype of that disorder (Table
1). Locus A acts in the mother during pregnancy, causing her to produce the
teratogenic effect that damages the developing fetus, leading to the developmental
disorder either in the fetus or postnatally in the child or adult. Since this locus is
recessive in action, the mother, a homozygote (aa) for the disease allele, is the genetic
"patient." Her abnormal phenotype, the "teratogenic phenotype", is the trait of
producing the teratogenic effect during pregnancy. Her fetus, damaged by the
teratogenic effect in utero, does develop the teratogen-induced phenotype. However,
the fetus is only a heterozygote (Aa) at locus A and thus lacks both the abnormal
homozygous genotype at locus A and the abnormal teratogenic phenotype; e.g., if the
fetus is a daughter, she will not produce the teratogenic effect later during pregnancy.
Thus, the fetus is affected with the developmental disorder but is not the genetic
"patient." Locus A, acting through a teratogenic effect, cannot be the only etiological
factor for the developmental disorder. If it were, then all pregnancies of an aa mother
would have the teratogen-induced phenotype, which is not the case. Environmental
and/or other genetic factors are required. An aa father will have the abnormal
genotype, but not the abnormal teratogenic phenotype because he could never become
pregnant.

2. Locus B, dominant teratogenic locus, acting in the mother: The situation is the
same as for locus A except that locus B is dominant in action (Table 1). The mother
has the abnormal genotype, Bb, and the abnormal teratogenic phenotype. The fetus
has the teratogen-induced phenotype but in the instance shown (Table 1) has neither
the abnormal genotype, the teratogenic phenotype, nor even a copy of the disease
allele. The maternal grandfather shown (Table 1) has the abnormal genotype, Bb, but
does not have the teratogenic phenotype because he could never become pregnant.

3. Environmental effects: The teratogenic effect is modified by environmental
factors, e.g., maternal dietary factors, infection, or ingestion of a teratogen. These
environmental factors may interact with locus A or B or may act independently. From
the perspective of the fetus that will later develop the developmental disorder
(teratogen-induced phenotype), intrauterine teratogenesis is an environmental, not a
genetic, effect.

4. Modifying or Specificity Loci Acting in the Fetus, Loci C and D: These loci may
interact with the teratogenic locus or the environmental factors to increase or decrease
their effect, or alternatively could act independently. Such genetic factors may be
recessive (locus C) or dominant (locus D). Genotypes and phenotypes of loci C and
D behave conventionally with respect to the developmental disorder. For loci C and
D, the fetus with the developmental disorder is now the genetic "patient". Maternal
teratogenesis in utero is an environmental effect. It is thus possible that the same gene
locus could act in part as a teratogenic locus and in part as a modifying or specificity
locus.
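
The relationships in Table 1 and items 1-4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the genotype strings are copied from Table 1, while the function names (`loci`, `is_mendelian_consistent`, `has_teratogenic_phenotype`, `fetal_modifier_genotypes`) are hypothetical and are not part of the disclosed methods.

```python
# Genotypes copied from Table 1; two letters per locus, loci in the order A, B, C, D.
FAMILY = {
    "maternal_grandmother": "AabbCCdd",
    "maternal_grandfather": "AaBbCcdd",
    "paternal_grandmother": "AAbbCcDd",
    "paternal_grandfather": "AAbbCCdd",
    "mother": "aaBbCcdd",
    "father": "AAbbCcDd",
    "child": "AabbccDd",
}


def loci(genotype):
    """Split 'AabbCCdd' into per-locus allele pairs: ['Aa', 'bb', 'CC', 'dd']."""
    return [genotype[i:i + 2] for i in range(0, len(genotype), 2)]


def is_mendelian_consistent(child, mother, father):
    """True if, at every locus, the child carries one allele from each parent."""
    for c, m, f in zip(loci(child), loci(mother), loci(father)):
        if not ((c[0] in m and c[1] in f) or (c[1] in m and c[0] in f)):
            return False
    return True


def has_teratogenic_phenotype(genotype, is_female):
    """Locus A is recessive (aa), locus B dominant (B present); both act only in a mother."""
    a_pair, b_pair = loci(genotype)[0], loci(genotype)[1]
    abnormal_genotype = a_pair == "aa" or "B" in b_pair
    return is_female and abnormal_genotype


def fetal_modifier_genotypes(genotype):
    """Per modifying locus, report whether the fetus has the abnormal genotype (cc; D present)."""
    c_pair, d_pair = loci(genotype)[2], loci(genotype)[3]
    return {"C": c_pair == "cc", "D": "D" in d_pair}


print(is_mendelian_consistent(FAMILY["child"], FAMILY["mother"], FAMILY["father"]))
print(has_teratogenic_phenotype(FAMILY["mother"], is_female=True))
print(has_teratogenic_phenotype(FAMILY["maternal_grandfather"], is_female=False))
print(fetal_modifier_genotypes(FAMILY["child"]))
```

With the Table 1 genotypes, the mother is the only family member for whom `has_teratogenic_phenotype` returns true, whereas the child is the only one flagged at both modifying loci. The sketch treats each locus separately and deliberately does not model the environmental factors of item 3.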

DISCUSSION 

The Example of Phenylketonuria: An example of the gene-teratogen model is the
major intrauterine effect in maternal phenylketonuria (PKU). Phenylketonuria itself
is a recessive postnatal disorder. Untreated homozygous PKU mothers and fathers
both have elevated blood phenylalanine (hyperphenylalaninemia). However,
heterozygous offspring of untreated PKU mothers (but not fathers) have an abnormal
phenotype [Koch et al., Acta Paediatr Suppl, 407:111-119 (1994); Allen et al., Acta
Paediatr Suppl, 407:83-85 (1994); Abadie et al., Archives Pediatr, 3:489-486
(1996)]. Thus the elevated blood phenylalanine or other metabolite(s) in the mother
acts as a teratogen for the fetus. Note that the fetus of an untreated phenylketonuric
mother does not have the phenotype of PKU (the "teratogenic phenotype"), but has a
different phenotype (the "teratogen-induced phenotype").
Phenylketonurics [Menkes, Textbook of Child Neurology, Lea & Febiger,
Philadelphia (1990)] are normal at birth and develop a progressive disorder
postnatally characterized by vomiting, eczema, seizures (infantile spasms with
hypsarrhythmia on electroencephalography), and mental retardation. The fetus of an
untreated phenylketonuric mother [Menkes, Textbook of Child Neurology, Lea &
Febiger, Philadelphia (1990)] has a congenital non-progressive disorder of fetal
origin characterized by microcephaly, abnormal facies, mental retardation, congenital
heart disease, and prenatal and postnatal growth retardation. The PKU phenotype is a
postnatal degenerative disorder; the phenotype of the PKU intrauterine effect is a
developmental disorder. The teratogenic effect is not dependent upon the fetal
genotype, although the fetus is an obligate heterozygote since the mother is a
homozygote for phenylketonuria and the father (usually) has the normal genotype.
Thus, in phenylketonuria, a mutation at the same gene locus causes two distinct
disorders depending upon whether the period of abnormal gene
action is prenatal or postnatal. A fetus with the abnormal homozygous genotype who
is carried by a heterozygous mother is protected in utero, but develops PKU
postnatally. A heterozygous fetus carried by a mother with the abnormal
homozygous genotype is damaged in utero when the mother's genotype predominates,
but is protected from PKU postnatally by its own genotype.



An Example from Studies in Inbred Mice: Finnell and Chernoff [Gene-teratogen
interactions: an approach to understanding the metabolic basis of birth defects. In
Pharmacokinetics in Teratogenesis, Vol. 11:97-109, Experimental Aspects In Vivo and
In Vitro, CRC Press, Inc., Boca Raton, FL (1987)] have reviewed a group of elegant
experiments in inbred mice documenting that differences in susceptibility to
exogenous teratogens can be regarded as a genetic trait that is determined by
susceptibility or liability genes of either the maternal or fetal genotype [Finnell and
Chernoff, Gene-teratogen interactions: an approach to understanding the metabolic
basis of birth defects, In Pharmacokinetics in Teratogenesis, Vol. 11:97-109,
Experimental Aspects In Vivo and In Vitro, CRC Press, Inc., Boca Raton, FL (1987);
Finnell et al., Am J. Med. Genet. 70:303-311 (1997); Bennett et al., Epilepsia 38:415-
423 (1997)]. For example, sensitivity to acetazolamide-induced ectrodactyly is
determined by the presence of three genes, and the fetus must be homozygous for the
recessive allele at all three loci in order to express the malformation. However, the
inbred mouse models used do not mirror the human situation in at least three respects.
First, the human population is an outbred population compared to these inbred mouse
models. Consequently, the relevant genotypes may be highly variable among
members of different families. Second, the inbred mouse experiments address the
question of exogenous rather than endogenous teratogens. Third, the inbred mouse
studies rely upon known or candidate susceptibility loci, whereas in humans, the
problem has been to locate and identify unknown disease loci, largely by using linkage
mapping techniques.

Implications for Linkage Mapping:

Teratogenic Locus (Locus A or B): The gene-teratogen model has major implications
for linkage mapping done with either parametric or non-parametric methods. The
problem for both methods is incorrect assignment of affection status. In the lod score
method, a genetic model of the disease is constructed and an affection status is
assigned to each member of the pedigree. If the genetic model specified is wrong, the
linkage results may be falsely positive or falsely negative [Terwilliger and Ott,
Handbook of Human Genetic Linkage, Johns Hopkins Univ. Press, Baltimore (1994)].



In developmental disorders resulting from the gene-teratogen model, the phenotype
assignment for lod score analysis will be incorrect. The patient with the
developmental disorder will be assigned the affected phenotype, whereas the patient
is actually affected only for the teratogen-induced phenotype, but is unaffected for the
teratogenic phenotype. Likewise, the mother will be assigned the unaffected
phenotype for linkage analysis. Actually, she is unaffected only for the
teratogen-induced phenotype, but is affected for the teratogenic phenotype. Lod
scores should increase when phenotype assignments have been corrected. However,
apparently dominant inheritance may in fact turn out to be pseudodominant if the
mutant allele is common in the population. For non-parametric analysis, a similar
misassignment occurs. In the case of affected sib-pairs, the affected sibs will be
assigned the affected phenotype. Actually, the sibs are affected only for the
teratogen-induced phenotype, but are unaffected for the teratogenic phenotype. The
mother will be assigned the unaffected or unknown phenotype. Actually, she is
unaffected only for the teratogen-induced phenotype but is affected for the teratogenic
phenotype. Thus, the "affected sib-pair" families are likely to turn out to contain only
a single sporadic case, since the only individual in the kindred affected with the
teratogenic phenotype will be the mother.

For the transmission/disequilibrium test (TDT) [Spielman et al., Am J Hum Genet,
52:506-516 (1993); Ewens and Spielman, Am J Hum Genet, 57:455-464 (1995)] the
patient with the developmental disorder will be assigned the affected phenotype.
Actually, the patient will be affected only for the teratogen-induced phenotype but
will be unaffected for the teratogenic phenotype. The mother will be assigned the
unaffected or unknown phenotype. Actually, she is unaffected only for the
teratogen-induced phenotype but is affected for the teratogenic phenotype. The
expectation of TDT is that alleles of a linked locus will show distortion from random
transmission from mother (or father) to the patient. Since the patient is unaffected for
the teratogenic phenotype, no transmission distortion from mother (or father) to child
will be observed. Transmission distortion for alleles of a teratogenic locus will in fact
occur from the mother's parents to the mother, the actual patient for the teratogenic
phenotype. But this will not be looked for because the phenotypes have been wrongly
assigned. In addition, grandparents of the patients with the developmental disorder
have probably not had DNA collected. Therefore, for the TDT, negative results may
occur for disease alleles of a teratogenic locus because incorrect phenotype
assignments will have been made. When correct phenotype assignments have been
made, transmission distortion to the mother from her parents should be expected for
disease alleles of a teratogenic locus. Analogous misassignments are made in allelic
association and haplotype relative-risk analyses [Falk and Rubinstein, Ann Hum Genet,
51:227-233 (1987); Terwilliger and Ott, Hum Hered, 42:337-346 (1992); Thomson,
Am J Hum Genet, 57:487-498 (1995)].
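
The effect of choosing the wrong proband on the TDT can be illustrated with a minimal Python sketch. The statistic used is the standard McNemar-type TDT form, (b - c)^2 / (b + c), where b and c count how often heterozygous parents did and did not transmit the disease allele. The transmission lists below are hypothetical numbers invented for illustration, and the function names `tdt_counts` and `tdt_chi_square` are assumptions, not part of any cited software.

```python
def tdt_counts(transmissions, disease_allele):
    """Count transmissions from heterozygous parents only.

    `transmissions` is a list of (parent_genotype, transmitted_allele) pairs.
    Returns (b, c): times the disease allele was / was not transmitted.
    """
    b = c = 0
    for parent_genotype, transmitted in transmissions:
        if parent_genotype[0] == parent_genotype[1]:
            continue  # homozygous parents are uninformative for the TDT
        if transmitted == disease_allele:
            b += 1
        else:
            c += 1
    return b, c


def tdt_chi_square(b, c):
    """McNemar-type TDT statistic, (b - c)^2 / (b + c), 1 degree of freedom."""
    return (b - c) ** 2 / (b + c) if (b + c) else 0.0


# Hypothetical data for recessive locus A (disease allele 'a').
# Correct proband: transmissions from the mother's parents to the mother.
to_mothers = [("Aa", "a")] * 30 + [("Aa", "A")] * 10
# Conventional proband: transmissions from the parents to the affected child.
# The aa mothers are homozygous and therefore uninformative.
to_patients = [("aa", "a")] * 20 + [("Aa", "a")] * 10 + [("Aa", "A")] * 10

print(tdt_chi_square(*tdt_counts(to_mothers, "a")))
print(tdt_chi_square(*tdt_counts(to_patients, "a")))
```

In this made-up data set the statistic is 10.0 when the mother is treated as the proband and 0.0 when the child is, which mirrors the argument above that distortion at a teratogenic locus appears only in the mother's nuclear family.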

Modifying or Specificity Loci (Locus C and/or D): Since these loci behave in a
conventional fashion, the phenotype assignments will be correct. Consequently,
genes identified by conventional parametric or non-parametric linkage studies are
likely to be modifying or specificity loci. An important question for linkage mapping
is the relative contribution to the abnormal phenotype of the developmental disorder
made by the teratogenic locus versus that of a modifying or specificity locus. If the
effect of a teratogenic locus is small, then loci identified by conventional linkage
studies will be specificity or modifying loci and the mode of inheritance will be
Mendelian or multifactorial. If a teratogenic locus makes a major contribution to
phenotype, then linkage mapping studies will not give a consistent answer and the
mode of inheritance will be difficult to determine.

The presence of a teratogenic locus may be suspected if the maternal contribution to
phenotype is different from or greater than the paternal contribution. For example,
the maternal relatives of infants with spina bifida more frequently have affected
children than the paternal relatives. Suggested explanations for this observation have
been mitochondrial inheritance, maternal effect, or genomic imprinting [Chatkupt, Am J
Med Genet, 44:508-512 (1992)]. The operation of a teratogenic locus is another
explanation and is itself a form of maternal effect. For a recessive teratogenic locus,
the mother's sisters would be at greatest risk of having offspring with the
teratogen-induced phenotype.

Implications for Definition of Phenotype: All the pregnancies of a mother with the
teratogenic phenotype are at risk for the developmental disorder, the
teratogen-induced phenotype. Yet only a few of the fetuses will be affected by the
developmental disorder because of the action of environmental factors and/or the
modifying or specificity loci. The action of the environmental factors is fully
quantitative: depending upon the amplitude of the environmental effect, a mild,
moderate, or severe teratogen-induced phenotype may result. In addition, the
environmental factor may act at different times in fetal development, producing
qualitatively different phenotypes. Thus, quantitatively or qualitatively different
teratogen-induced phenotypes may result from pregnancies of the same mother with
the teratogenic phenotype. In addition, the action of the modifying or specificity loci
may produce quantitatively or qualitatively different phenotypes in offspring of the
same couple. Such different phenotypes may be diagnostically classified as different
disorders. This may complicate attempts at associating specific loci with a
specific teratogen-induced phenotype. All of the teratogen-induced phenotypes
resulting from pregnancies of a mother with the teratogenic phenotype modified only
by environmental factors are genetically indistinguishable. However, such
teratogen-induced phenotypes affected also by the various modifying or specificity
loci segregating among the offspring of a single couple are only partially genetically
related.

Methods to Identify Teratogenic Loci: One effective approach to finding a putative
teratogenic locus is to carry out non-parametric linkage studies of families consisting
of a patient affected with the developmental disorder, the patient's two (unaffected)
parents, and the patient's four (unaffected) grandparents (Table 1). In such a family,
the mother is the genetic patient but the other family members are not. Now, the
mother's nuclear family (the mother and her parents) is compared with the father's
nuclear family (the father and his parents). In a haplotype relative risk study, the
disease allele(s) of the teratogenic locus will occur more frequently in the mother
compared with other alleles of her parents; the disease allele(s) of the teratogenic
locus will not occur more frequently in the father compared with other alleles of his
parents. In a transmission/disequilibrium test, transmission distortion will be seen for
the disease allele(s) of a teratogenic locus in the mother's nuclear family but not in the
father's nuclear family. In an allelic association study, the disease allele will occur
more frequently in mothers, patients (with the developmental disorder), and patient's
sibs (both affected and unaffected) than in unrelated control individuals. Disease
allele frequency in fathers will not be distinguishable from that in control individuals.
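
The allelic association comparison described above can be sketched as a plain 2x2 Pearson chi-square test on allele counts. This is a minimal illustration only: the allele counts are hypothetical, the function name `chi_square_2x2` is an assumption, and no continuity correction or multiple-testing adjustment is applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 d.f., no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator if denominator else 0.0


# Hypothetical allele counts per group: (disease alleles, other alleles).
controls = (40, 160)
mothers = (60, 140)
fathers = (40, 160)

# Under the gene-teratogen model, mothers are expected to differ from controls
# at a teratogenic locus, while fathers are not.
print(round(chi_square_2x2(*mothers, *controls), 3))
print(round(chi_square_2x2(*fathers, *controls), 3))
```

With these invented counts the mothers-versus-controls statistic is about 5.333 and the fathers-versus-controls statistic is 0.0; real data would of course require proper sample-size and significance considerations.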

Certain developmental disorders with a genetic component to etiology, whose mode
of inheritance has been difficult to determine or whose genetic factors have been
difficult to locate, including those mentioned earlier, are candidates for the
gene-teratogen model.

MODEL 2:

The DNA Polymorphism-Diet-Cofactor-Development Hypothesis
for Schizophrenia and Other Developmental Disorders

Folate metabolism is complex. At least 30 gene loci are involved in absorption,
transport, and metabolism of folate, and these are regulated by additional gene loci.
Any of these is potentially a genetic risk factor for schizophrenia, although MTHFR
and DHFR are particularly good candidates. Likewise, genes encoding proteins
involved in the pathways of other vitamin-cofactors may be genetic risk factors.

Two cofactors that may be of particular importance are cobalamin and
pyridoxine. Cobalamin is relevant because its metabolism is closely intertwined with
that of folate. For example, cobalamin is required for the activity of methionine
synthase (MTR), a folate-related enzyme. Decreased cobalamin can affect folate
metabolism through the folate trap. Pyridoxine is relevant because the
pyridoxine-dependent enzyme cystathionine beta-synthase (CBS), along with the
cobalamin-dependent enzyme MTR and folate pathways including MTHFR and
DHFR, all participate in catabolism of homocysteine, an amino acid that is suspected
of being a teratogen during pregnancy. Also, kynureninase, an important enzyme
affecting niacin metabolism and serotonin synthesis, is pyridoxine-dependent.
Therefore, mutations of the genes encoding such proteins, especially common
polymorphisms, could play a role in the cause of schizophrenia.

Since folate, cobalamin, and pyridoxine are all dietary constituents, the dietary
content of these cofactors could lead to an "environmental" generation of a risk
factor for schizophrenia. In addition, genes encoding proteins involved in folate,
cobalamin, and pyridoxine metabolism and catabolism could be genetic risk factors
for schizophrenia. Thus, the cofactors and the proteins involved in pathways relevant
to these cofactors can potentially have either or both environmental and genetic
effects on the susceptibility of an individual to schizophrenia.

Since the genetic aspect of schizophrenia differs so profoundly from other disorders
which have been identified by linkage mapping techniques, it is clear that a new
model for the genetic connection to schizophrenia is required. Therefore, the DNA
Polymorphism-Diet-Cofactor-Development (DDCD) hypothesis is disclosed herein.

The DDCD hypothesis is that interacting genetic and environmental factors affecting
the metabolism of folate, cobalamin, or pyridoxine, or all of these, play a role in the
etiology of schizophrenia. The genetic effect results from the aggregate effect of
multiple mutations that individually, for the most part, have small effects on folate-,
cobalamin- or pyridoxine-related genes, some of which will be common in the
population, and can act in utero. Environmental factors include dietary folate,
cobalamin, and pyridoxine. If schizophrenia results from mild deficiency during fetal
development of dietary folate, cobalamin, or pyridoxine, potentiated by mild genetic
susceptibility mutations of genes related to these cofactors and by pregnancy, then
this would be difficult to document by linkage mapping techniques. An example of
interaction of genetic and environmental factors is that genetic factors are important
for incorporating dietary folate; the enzyme dihydrofolate reductase is required for
conversion of dietary folate to folinic acid, thus allowing dietary folate to enter the
body's metabolic pathways. Another example is that folate and cobalamin
requirements increase during pregnancy; thus pregnancy could potentiate the effects
of mild genetic defects of mother, fetus, or both. Deficiencies of a vitamin are often
part of a broader dietary deficiency affecting multiple nutrients in addition to the
vitamin being measured.

Locus Heterogeneity: The metabolic pathways of folate, cobalamin, and pyridoxine
are complex and related to each other. Multiple gene loci that code for the enzymes
and transport proteins are required (Tables 2-7). Thus, a defect of folate, cobalamin, or
pyridoxine metabolism could result from the aggregate effect of multiple mutations,
each of relatively small effect, interacting with environmental factors. Different
individuals might have different combinations of mutations. Such a metabolic defect
would be difficult to detect by linkage mapping techniques because of locus
heterogeneity.

Alternatively, even if a single genetic defect were sufficient to make an individual more
susceptible to having schizophrenic offspring, the large number of potential genetic
factors and the corresponding importance of environmental factors would still make
such an individual genetic defect difficult to elucidate unless, of course, the genetic
defect caused a major effect. The difficulty in
elucidating an individual genetic defect is magnified when the genetic factor acts in
the mother, and not in the schizophrenic patient.

High Disease Allele Frequency: Numerous mutational variants of folate and
cobalamin genes are known. Some of these have functional significance and in
addition are sufficiently common in a given population to be regarded as genetic
polymorphisms. However, these common alleles are unlikely to have a major
harmful effect by themselves, for if they did they would become uncommon in the
population in the absence of selection effects, and would likely appear as Mendelian
disorders. Thus, the folate, cobalamin, or pyridoxine disease alleles related to
schizophrenia would appear to be more likely those of minor deleterious effect or
those with harmful effect only in the presence of environmental deficiencies or
pregnancy. Such disease genes of high population frequency will be difficult to
detect by linkage mapping methods because high disease allele frequency decreases
the power of linkage studies [Terwilliger and Ott, Handbook of Human Genetic
Linkage, Johns Hopkins Univ. Press, Baltimore (1994)].
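
To give a feel for what "high disease allele frequency" means in terms of genotypes, the following minimal Python sketch computes genotype frequencies from an allele frequency. It assumes Hardy-Weinberg proportions, which the text above does not state; the allele frequencies used and the function name `hardy_weinberg` are likewise illustrative assumptions.

```python
def hardy_weinberg(q):
    """Genotype frequencies for disease-allele frequency q, assuming Hardy-Weinberg proportions."""
    p = 1.0 - q
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


# Illustrative allele frequencies only; not taken from any data set.
for q in (0.01, 0.10, 0.30):
    freqs = hardy_weinberg(q)
    print(f"q={q:.2f}  heterozygotes(Aa)={freqs['Aa']:.4f}  homozygotes(aa)={freqs['aa']:.4f}")
```

Under this assumption, a rare allele (q = 0.01) gives homozygotes in about 1 in 10,000 individuals, whereas a common allele (q = 0.30) gives homozygotes in about 9% and heterozygotes in about 42% of individuals. A disease genotype that common is carried by many unaffected relatives and by individuals marrying into a pedigree, which is consistent with the loss of linkage power described above.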

Developmental Genes: Folate, cobalamin, and pyridoxine defects act prenatally as
well as postnatally. Folate, cobalamin, and pyridoxine metabolism are crucial for
DNA synthesis and cell division, which are of disproportionate importance during
brain development. Some defects of folate, cobalamin, or pyridoxine metabolism
elevate blood homocysteine, a toxic and potentially teratogenic substance. Genes
acting in the mother to damage the developing fetus, e.g., via the gene-teratogen
model (Model 1, above), have a mode of inheritance that is neither dominant nor
recessive with respect to the fetus. Attempts to assign a mode of inheritance in this
situation will be unsatisfactory because affection status would be incorrectly assigned.
The mode of inheritance of a developmental disorder resulting from a teratogenic
locus would be regarded as either multifactorial or unknown. This is the situation
with schizophrenia, whose mode of inheritance is unknown. Use of an incorrect
genetic model decreases the power of linkage studies [Terwilliger and Ott,
Handbook of Human Genetic Linkage, Johns Hopkins Univ. Press, Baltimore
(1994)].

Genes of Folate Metabolism: Folate metabolism is extremely complex [Rosenblatt,
In: The Metabolic and Molecular Bases of Inherited Disease, Scriver et al. (eds), New
York: McGraw-Hill, pp. 3111-3128 (1995); Mudd et al. In: The Metabolic and
Molecular Bases of Inherited Disease, Scriver et al. (eds), New York: McGraw-Hill,
pp. 1279-1327 (1995)]. At least 30 gene loci (Table 2) have been identified as
folate-related. These contribute to folate-mediated 1-carbon transfer reactions,
binding, transport and metabolism of folate, and other functions. A number of these
have been cloned and localized to a chromosomal region (Table 3).

TABLE 2

FOLATE-RELATED GENES/ENZYMES/TRANSPORTERS(a)

Folate-Related Genes/Enzymes/Transporters(a) | SEQ ID NO:
methylenetetrahydrofolate reductase, MTHFR, MIM 236250 | 1
methionine synthase (methyltetrahydrofolate:L-homocysteine S-methyltransferase), MTR, MIM 156570 | 2
dihydrofolate reductase, DHFR, MIM 126060 | 3
folylpolyglutamate synthase, FPGS, MIM 136510 | 4
folate receptor 1, folate receptor alpha (FOLR1, adult; FR-alpha), MIM 136430 | 5
folate receptor 2, folate receptor beta (FOLR2, fetal; FR-beta), MIM 136425 (a.a.) | 6
folate receptor 2-like (FOLR2L, fetal-like), MIM-none |
folate receptor gamma (FR-gamma), MIM 602469 | 7
serine hydroxymethyltransferase 1, SHMT1, MIM 182144 | 8
methylenetetrahydrofolate dehydrogenase, methenyltetrahydrofolate cyclohydrolase, 10-formyltetrahydrofolate synthetase (trifunctional enzyme, MTHFD), MIM 172460 | 9
serine hydroxymethyltransferase 2, SHMT2, MIM 138450 | 10
thymidylate synthase, TYMS, MIM 188350 | 11
GAR (5-phosphoribosylglycineamide) transformylase, GART, MIM 138440 | 12
reduced folate carrier-1, RFC1. Probably identical to micromolar membrane transport protein, intestinal folate carrier-1 (IFC1), and neutral folate transport protein. MIM 600424 | 13
cystathionine beta-synthase, CBS, MIM 236200 | 14
AICAR (5-phosphoribosyl-5-aminoimidazole-4-carboxamide) transformylase | 15
glutamate formimino transferase, MIM 229100 |
formiminotetrahydrofolate cyclodeaminase |
5,10-methenyltetrahydrofolate synthetase | 16
10-formyltetrahydrofolate dehydrogenase, MIM 600249 |
glycine cleavage pathway (SHMT plus three enzymes): MIM 238331; Gly-decarboxylase MIM 238300; H-Protein MIM 238330; T-Protein MIM 238310 | 18, 19
cblG (affects function of MTR), MIM 250940 |
methionine adenosyltransferase 1, MAT1A, (ATP:L-methionine S-adenosyltransferase), MIM 250850 | 20
pteroyl polyglutamate hydrolase ("conjugase"), form 1 |
pteroyl polyglutamate hydrolase ("conjugase"), form 2 |
NAD-dependent enzyme methylenetetrahydrofolate dehydrogenase cyclohydrolase (a.a.) | 21
methionine adenosyltransferase 2, MAT2A, MIM 601468 | 22
5-methyltetrahydrofolate-homocysteine methyltransferase reductase (MTRR), MIM 602568; #Variant in MTRR linked to cblE, MIM 236270 | 23
methyltransferases |
S-adenosylmethionine decarboxylase, MIM 180980 | 24
decarboxylated S-adenosylmethionine:putrescine propylaminotransferase or spermidine synthetase (a.a.) | 25
S-adenosylhomocysteine hydrolase, MIM 180960 | 26
betaine-homocysteine methyltransferase dimethylthetin-homocysteine methyltransferase | 27
gamma-cystathionase (L-cystathionine cysteine-lyase (deaminating)), MIM 602888 | 28
folic acid transport protein, MIM 229050 |
DHFR (exon 6 and 3' flanking region) | 30
kynureninase | 35
human DHFR, exons 1 and 2 [Chen et al., J. Biol. Chem. 259:3933-3943 (1984)] | 36

(a) listed with alternate names, abbreviations, and MIM numbers;
# cblE is a phenotype for a particular group of disorders of folate/cobalamin metabolism;
(a.a.) indicates the amino acid sequence.



TABLE 3

LOCALIZED GENE LOCI RELATED TO FOLATE METABOLISM

Gene/enzyme/transport protein | Location | References
MTHFR | 1p36.3 | Goyette et al. (1994)
MTR | 1q43 | Cook and Hamerton (1979); Mellman et al. (1979); **
DHFR | 5q11.2-13.2 | Weiffenbach et al. (1991); Gilliam et al. (1989b); *, **
FPGS | 9cen-q34 | Jones and Kao (1984); Walter et al. (1992)
MAT | 10q22 | **
FR | 11q13.3-q14.1; 11q13.3-q13.5 | Lacey et al. (1989); Ragoussis et al. (1992); Ratnum et al. (1989); Walter et al. (1992), *; Ragoussis et al. (1992), **
SHMT2 | 12q12-q14; 12q13 | Garrow et al. (1993); Law and Kao (1979), *; **
MTHFD | 14q24 | Rozen et al. (1989); Jones et al. (1981); *, **
LCCL | 16pter-qter |
SHMT1 | 17p11.2 | Garrow et al. (1993); *, **
TYMS | 18p11.31-p11.22; 18p11.32 | *; Hori et al. (1990); Silverman et al. (1993)
SAHH | 20cen-q13.1 |
GART | 21q22.1 | McInnis et al. (1993); Schild et al. (1990); Avramopoulos et al. (1993); Goto et al. (1993)
RFC1 | 21q22.2-22.3 | Moscow et al. (1995)
CBS | 21q22.3 | Munke et al. (1988)



notes: MTHFR=methylenetetrahydrofolate reductase. MTS=methionine s 
DHFR=dihydrofolate reductase. FPGS=folylpolyglutamate synthase.
MAT=methionine adenosyltransferase (ATP:L-methionine S-adenosyltransferase).
FR=folate receptor complex: FR-alpha=FOLR1=folate receptor 1, adult;
FR-beta=FOLR2=folate receptor 2, fetal; FR-gamma; FOLR2L=folate receptor 2-like.
SHMT2=serine hydroxymethyltransferase 2, mitochondrial.
MTHFD=5,10-methylenetetrahydrofolate dehydrogenase, 5,10-methylenetetrahydrofolate
cyclohydrolase, 10-formyltetrahydrofolate synthase (trifunctional enzyme).
LCCL=gamma-cystathionase (L-cystathionine cysteine-lyase (deaminating)).
SHMT1=serine hydroxymethyltransferase 1, soluble. TYMS=thymidylate
synthetase. SAHH=S-adenosylhomocysteine hydrolase.
GART=phosphoribosylglycineamide formyltransferase. RFC1=reduced folate
carrier-1 (possibly identical to IFC1, intestinal folate carrier-1). CBS=cystathionine
beta-synthase. Location information from GDB (*), from MIM (**).

Goyette et al., Nat. Gen. 7:195-200 (1994)
Cook and Hamerton, Cytogenet. Cell Genet. 25:9-20 (1979)
Mellman et al., Proc. Natl. Acad. Sci. 76:405-409 (1979)
Weiffenbach et al., Genomics 10:173-185 (1991)
Gilliam et al., Genomics 5:940-944 (1989b)
Jones and Kao, Cytogenet. Cell Genet. 37:499 (1984)
Walter et al., Ann. Hum. Genet. 56:212 (1992)
Lacey et al., Am. J. Med. Genet. 60:172-173 (1989)
Ragoussis et al., Genomics 14:423-430 (1992)
Ratnum et al., Biochem. 28:8249-8254 (1989)
Garrow et al., J. Biol. Chem. 268:11910-11916 (1993)
Law and Kao, Cytogenet. Cell Genet. 24:102-114 (1979)
Rozen et al., Ann. Hum. Genet. 44:781-786 (1989)
Jones et al., Somat. Cell Genet. 7:399-409 (1981)
Hori et al., Hum. Genet. 85:576-580 (1990)
Silverman et al., Genomics 15:442-445 (1993)
McInnis et al., Genomics 16:562-571 (1993)
Schild et al., Proc. Natl. Acad. Sci. 87:2916-2920 (1990)
Avramopoulos et al., Genomics 15:98-102 (1993)
Goto et al., Neuromusc. Disord. 3:157-160 (1993)
Moscow et al., Cancer Res. 55:3790-3794 (1995)
Munke et al., Am. J. Hum. Gen. 42:550-559 (1988)



Genes of Cobalamin Metabolism: Cobalamin metabolism is also complex [Benton
and Rosenberg, In: The Metabolic and Molecular Bases of Inherited Disease,
Scriver et al. (eds), New York: McGraw-Hill, 3129-3149 (1995)]. At least
15 gene loci (Table 4) have been identified as cobalamin-related. These contribute to
the binding, transport, and metabolism of cobalamin, and its functions. A number of
these have been cloned and localized to a chromosomal region (Table 5). Cobalamin
metabolism is closely intertwined with that of folate. For example, cobalamin is
required for the activity of MTR, a folate-related enzyme. Decreased cobalamin can
affect folate metabolism through the folate trap [Rosenblatt, In: The Metabolic and
Molecular Bases of Inherited Disease, Scriver et al. (eds), New York: McGraw-Hill,
pp. 3111-3128 (1995); Quadros et al., Biochem. Biophys. Res. Commun., 222:149-154
(1996)].

TABLE 4



COBALAMIN-RELATED GENES/ENZYMES/TRANSPORTERS^a

(gastric) intrinsic factor, GIF, MIM 261000 (combined deficiency of GIF & R-binder, MIM 243320)
intrinsic factor receptor, IFCR, MIM 261100
transcobalamin I, TCI (an R-protein, plasma), MIM 189905
transcobalamin III, TCIII (an R-protein, plasma), MIM none
other R-proteins (R-binders, cobalophilins, haptocorrins), MIM 193090
transcobalamin II, TCII, MIM 275350
transcobalamin II receptor, TCII receptor, MIM none
methylmalonyl Co-A mutase, MCM (MUT locus), MIM 251000
cblF, lysosomal cbl efflux, MIM 277380
cblC, cytosolic cbl metabolism, MIM 277400
cblD, cytosolic cbl metabolism, MIM 277410
cblA, mitochondrial cbl reduction (AdoCbl synthesis only), MIM 251100
cblB, cob(I)alamin adenosyltransferase (AdoCbl synthesis only), MIM 251110
cblE, methyltransferase-associated cbl utilization, MIM 236270
cblG, methyltransferase-associated cbl utilization, MIM 250940

^a listed with alternate names, abbreviations, and MIM numbers

TABLE 5



LOCALIZED GENE LOCI RELATED TO COBALAMIN METABOLISM

Gene/enzyme/transport protein   Location                        References
MCM (MUT locus)                 6p21.2-p21.1                    Qureshi et al. (1994) *
IF/GIF                          11q12-q13                       Hewit et al. (1991) *
TCI (an R-protein, plasma)      11q11-q12.3                     Johnston et al. (1992); Sigal et al. (1987) *
TCII                            22q11.2-q13; 22q12/13 border    Li et al. (1995)

notes: MCM=methylmalonyl Co-A mutase; IF/GIF=(gastric) intrinsic factor;
TCI=transcobalamin I; TCII=transcobalamin II. Location information from GDB (*), from
MIM (**).

Qureshi et al., Crit. Rev. Oncol. Hematol. 17:133-151 (1994)

Hewit et al., Genomics 10:432-440 (1991)

Johnston et al., Genomics 12:459-464 (1992)

Sigal et al., N. Engl. J. Med. 317:1330-1332 (1987)

Li et al., Biochem. Biophys. Res. Comm. 208:756-764 (1995)



Genes of Pyridoxine Metabolism: Pyridoxine metabolism is also complex, with three
dietary forms convertible to pyridoxal phosphate [Whyte et al., Hypophosphatasia,
In: The Metabolic and Molecular Bases of Inherited Disease, Scriver et al. (eds), New
York: McGraw-Hill, pp. 4095-4111 (1995)] and many pyridoxine-related and
pyridoxine-dependent enzymes, including decarboxylases and all aminotransferases
(Table 6). A number of pyridoxine-related enzymes have been cloned and localized
to a chromosomal region (Table 7). Pyridoxine metabolism is related to folate
metabolism, especially 1-carbon transfer reactions: both serine
hydroxymethyltransferases and the P-protein (glycine decarboxylase) of the glycine
breakdown system are pyridoxine-dependent.

TABLE 6

SOME PYRIDOXINE-RELATED GENES/ENZYMES^a

1. cystathionine beta-synthase, CBS, MIM 236200
2. gamma-cystathionase (L-cystathionine cysteine-lyase, deaminating), LCCL, MIM 219500
3. glycine cleavage system (GCS): glycine decarboxylase (P-protein)
4. serine hydroxymethyltransferase 1, SHMT1, MIM 182144
5. serine hydroxymethyltransferase 2, SHMT2
6. kynureninase
7. all aminotransferases (e.g. ornithine-gamma-aminotransferase, OAT)
8. decarboxylases, e.g. glutamic acid decarboxylases, GAD1, GAD2, MIM 266100
9. pyridoxamine(pyridoxine)-5'-phosphate oxidase, MIM 603287

^a listed with alternate names, abbreviations, and MIM numbers.

TABLE 7

SOME LOCALIZED GENE LOCI RELATED TO PYRIDOXINE METABOLISM

    Gene/enzyme       Location       References
1.  GAD2              2q31           Bu et al. (1992)
2.  GCS P-protein     9p13           Hamosh et al. (1995)
3.  GAD1              10p11.23       Bu et al. (1992)
4.  OAT               10q26
5.  SHMT2             12q12-14       Garrow et al. (1993); Law and Kao (1979)
6.  LCCL              16pter-qter
7.  SHMT1             17p11.2        Garrow et al. (1993) *
8.  CBS               21q22.3        Munke et al. (1988)
9.  PNPO (PPO)                       Ngo et al. (1998)

Location information from GDB (*), from MIM (**).

notes: GAD2=glutamic acid decarboxylase 2, 67 kDa. GCS=glycine cleaving system,
P-protein=glycine decarboxylase subunit. GAD1=glutamic acid decarboxylase 1,
65 kDa. OAT=ornithine-gamma-aminotransferase. SHMT2=serine
hydroxymethyltransferase 2, mitochondrial. LCCL=gamma-cystathionase
(L-cystathionine cysteine-lyase, deaminating). SHMT1=serine
hydroxymethyltransferase 1, soluble. CBS=cystathionine beta-synthase.
PNPO=pyridoxamine(pyridoxine)-5'-phosphate oxidase.

References:

Bu et al., Proc. Natl. Acad. Sci., 89:2115 (1992).
Hamosh et al., In: "The Metabolic and Molecular Bases of Inherited Disease",
Scriver et al. (eds), New York: McGraw-Hill, pp. 1337-1348 (1995).
Garrow et al., J. Biol. Chem. 268:11910-11916 (1993).
Law and Kao, Cytogenet. Cell Genet. 24:102-114 (1979).
Munke et al., Am. J. Hum. Gen. 42:550-559 (1988).
Ngo et al., Biochemistry 37:7741-7748 (1998).

Relevance of Folate, Cobalamin, and Pyridoxine to Schizophrenia: There is
considerable evidence that schizophrenia results, at least in part, from damage to brain
development in utero that becomes symptomatic in late adolescence or early
adulthood. The etiology of schizophrenia has both genetic and environmental
components. Because folate, cobalamin, and pyridoxine are all ingested and
metabolized, they could potentially be both environmental and genetic factors for
schizophrenia. Folate, cobalamin, and pyridoxine are relevant to schizophrenia in
several important ways. First, all of them are required for cell division because of their
role in nucleic acid synthesis [Rosenblatt, In: The Metabolic and Molecular Bases of
Inherited Disease, Scriver et al. (eds), New York: McGraw-Hill, pp. 3111-3128
(1995); Benton and Rosenberg, In: The Metabolic and Molecular Bases of Inherited
Disease, Scriver et al. (eds), New York: McGraw-Hill, 3129-3149 (1995)]. The
developmental brain insult implicated in schizophrenia [Akbarian et al., Arch. Gen.
Psychiatry, 50:169-177 (1993); Akbarian et al., Arch. Gen. Psychiatry, 50:178-187
(1993)] is an abnormality of neurogenesis and neuronal migration, which are
midtrimester events requiring cell division. Thus folate, cobalamin, and pyridoxine
deficiencies could result in the widespread decreased grey matter volume observed in
schizophrenia.

Individuals who become schizophrenic later in life are more likely to be born during
the winter and early spring [Boyd et al., Schizophr. Bull., 12:173-186 (1986); Kendell
and Adams, Br. J. Psychiatry, 158:758-763 (1991); O'Callaghan et al., Br. J.
Psychiatry, 158:764-769 (1991)]; this corresponds to a midtrimester in late fall and
winter. Many folate- and pyridoxine-containing foods, e.g. dark green leafy
vegetables, are less readily available in late fall and winter in northern climates.
Seasonality was found to be a major determinant of micronutrient status, including
folate status, in a population of pregnant and lactating women in The Gambia, where
folate deficiency was widespread [Bates et al., Eur. J. Clin. Nutr. 48:660-668 (1994)].
Dietary cobalamin comes from animal foods, e.g. meat, dairy products, and fish, and
prolonged dietary insufficiency is required to produce cobalamin deficiency unless a
person is a strict vegetarian or already has subclinical deficiency [Sanders and Reddy,
Am. J. Clin. Nutr., 59:1176S-1181S (1994)]. In fact, a significant fraction of the
population already has subclinical deficiency for folate [Lewis et al., Ann. NY Acad.
Sci., 678:360-362 (1993)] and for cobalamin [Carmel et al., Arch. Intern. Med.,
147:1995-1996 (1987); Pennypacker et al., J. Am. Geriatr. Soc., 40:1197-1204 (1992);
Naurath et al., Lancet, 346:85-89 (1995); Allen et al., Am. J. Clin. Nutr., 62:1013-1019
(1995); Black et al., J. Nutr., 124:1179-1188 (1994)]. Also, the dietary folate requirement
increases during pregnancy [Scholl et al., Am. J. Clin. Nutr., 63:520-525 (1996);
McPartlin et al., Lancet, 341:148-149 (1993)] and most women become folate
deficient during late pregnancy [Giles, J. Clin. Pathol. 19:1-11 (1966)]. Cobalamin
deficiency is also common during pregnancy [Gadowsky et al., J. Adolesc. Health,
16:465-474 (1995)], although subnormal levels of vitamin B12 during pregnancy must
be interpreted with caution [Metz et al., Am. J. Hematol., 48:251-255 (1995)]. An
increase in schizophrenia births has also been noticed after winter famine [Susser and
Lin, Arch. Gen. Psychiatry, 49:983-988 (1992); Susser et al., Arch. Gen. Psychiatry,
53:25-31 (1996)], a time when severe dietary deficiency of both folate and cobalamin
is more likely. A temporary increase in the incidence of neural tube defects was
reported in Jamaica 11-18 months following Hurricane Gilbert and was found to be
associated with decreased dietary folate [Duff and Cooper, Am. J. Pub. Health
84:473-476 (1994)].

Schizophrenia is also associated with obstetrical complications, e.g. low birth weight
and prematurity [Lewis and Murray, J. Psychiatr. Res., 21:413-421 (1987)]. Low
birth weight and prematurity have also been associated with dietary folate deficiency
during pregnancy [Scholl et al., Am. J. Clin. Nutr., 63:520-525 (1996)].
Hyperhomocysteinemia is a risk factor for unexplained recurrent early pregnancy loss
[Wouters et al., Fertil. Steril., 60:820-825 (1993)] and for abruptio placentae
[Goddijn-Wesel et al., Eur. J. Obstet. Gynecol. Reprod. Biol., 66:23-29 (1996)].
Hyperhomocysteinemia may be related to defects in folate-, cobalamin-, or
pyridoxine-dependent reactions [Naurath et al., Lancet, 346:85-89 (1995)].
Interestingly, stillbirths and schizophrenia share a similar seasonality of birth excess
[Torrey et al., Schizophr. Bull., 19:557-562 (1993)]. Also, N2O, an anaesthetic gas
that inhibits MTR, a cobalamin-requiring enzyme of folate metabolism, is a
reproductive toxin for both men and women [Louis-Ferdinand, Adverse Drug React.
Toxicol. Rev., 13:193-206 (1994)]. Methotrexate, an inhibitor of dihydrofolate
reductase (DHFR), induces abortion.

Dietary folate deficiency and low plasma folate are common in inner city urban
populations [Scholl et al., Am. J. Clin. Nutr., 63:520-525 (1996)]. Likewise,
schizophrenia has been reported to be more common in inner city urban populations
[Fuller and Bowler, Schizophr. Bull. 16:591-604 (1990)]. Also, both low folate
intake [Schorah and Wild, Lancet, 341:1417 (1993)] and schizophrenia [Dohrenwend
et al., Science, 255:946-952 (1992)] are correlated with lower socioeconomic status.

Immune function is impaired in folate deficiency [LeLeiko and Chao, In: Rudolph's
Pediatrics, 20th ed., Stamford, CT: Appleton & Lange, pp. 1001-1010 (1996)], in
cobalamin deficiency [Hitzig et al., Ciba Found. Symp., 68:77-91 (1978)], and in
pyridoxine deficiency [Trakatellis et al., Postgrad. Med. J. 73:617-622 (1997)], and
deficient individuals are more susceptible to infection. Methotrexate, an inhibitor of
dihydrofolate reductase, inhibits immune function [Hughes, In: Rudolph's Pediatrics,
20th ed., Stamford, CT: Appleton & Lange, pp. 517-519 (1997)]. And, as
mentioned, dietary folate and cobalamin requirements increase during pregnancy
[Scholl et al., Am. J. Clin. Nutr., 63:520-525 (1996); McPartlin et al., Lancet,
341:148-149 (1993)]. This is relevant because the season-of-birth effect just
mentioned in connection with dietary folate or cobalamin deficiency has also been
explained by in utero infectious illness, the "viral theory" of schizophrenia.

Individuals born following winters with severe influenza epidemics are more likely to
develop schizophrenia [Adams et al., Br. J. Psychiatry, 163:522-534 (1993)], though
not all studies find this effect. Although it has not been demonstrated that either the
fetus later affected by schizophrenia or the pregnant mother actually developed
influenza, the histologic pattern in schizophrenia of a neuronal migration abnormality
during brain development has been seen as compatible with a fetal viral infection
[Kovelman and Scheibel, Biol. Psychiatry, 19:1601-1621 (1984); Bogerts et al., Arch.
Gen. Psychiatry, 42:784-791 (1985); Akbarian et al., Arch. Gen. Psychiatry, 50:169-177
(1993); Akbarian et al., Arch. Gen. Psychiatry, 50:178-187 (1993)]. Thus folate or
cobalamin deficiency during pregnancy could result in greater susceptibility to viral
infection affecting mother, fetus, or both. The infectious agent could be influenza
itself. Alternatively, a severe influenza epidemic could be a "marker" of a severe
winter, and infection by another agent could cause the brain damage. In this way,
folate or cobalamin deficiency could cause the season-of-birth effect through
the mechanism of dietary deficiency alone, through maternal immune deficiency and
infection, or both.

Methotrexate, a DHFR inhibitor, is also an important therapeutic agent for
rheumatoid arthritis. Rheumatoid arthritis has repeatedly been found to have a
decreased frequency in schizophrenics, a puzzling finding that remains unexplained
[Eaton et al., Schizophr. Res., 6:181-192 (1992)].

The developmental model of schizophrenia postulates that brain damage sustained in
the second trimester of fetal life results in schizophrenia later in development [Brixey
et al., J. Clin. Psychol., 49:447-456 (1993)]. Both folate and cobalamin are already
known to contribute to a first trimester fetal nervous system malformation, spina
bifida cystica [Kirke et al., Q. J. Med., 86:703-708 (1993); Gordon, Brain Dev.,
17:307-311 (1995)], and possibly other birth defects [Shaw et al., Lancet, 346:393-396
(1995); Czeizel, Lancet, 345:932 (1995)]. Some studies [Whitehead et al., Q. J.
Med., 88:763-766 (1995); van der Put et al., Lancet, 346:1070-1071 (1995); Ou et
al., Am. J. Med. Genet., 63:610-614 (1996); Chatkupt et al., Am. Acad. Neurol. Works
in Progress, WIP4 (1996)] suggest that a genetic susceptibility factor for spina bifida
is a common allele of the folate gene MTHFR, the nucleotide 677C->T transition,
which converts an alanine residue to valine and results in a heat-labile enzyme protein.
Homozygotes for this allele, about 10% of the normal population, have lower
erythrocyte folate and plasma folate during pregnancy [Molloy et al., Lancet,
349:1591-1593 (1997)]. Homozygotes for this allele also develop moderately
elevated blood homocysteine [van der Put et al., Lancet, 346:1070-1071 (1995);
Frosst et al., Nature Genet., 10:111-113 (1995)] in the presence of dietary folate
deficiency. Moderate hyperhomocysteinemia is toxic to adults [Fermo et al., Ann.
Intern. Med., 123:747-753 (1995)], toxic to the fetus in early gestation [Wouters
et al., Fertil. Steril., 60:820-825 (1993)], and possibly teratogenic in the first trimester,
causing neural tube defects [Whitehead et al., Q. J. Med., 88:763-766 (1995); van der
Put et al., Lancet, 346:1070-1071 (1995); Ou et al., Am. J. Med. Genet., 63:610-614
(1996)]. Thus, the MTHFR heat-labile mutation, in the presence of decreased dietary
folate in midtrimester, could be teratogenic both through hyperhomocysteinemia and
through folate deficiency, causing the developmental brain damage hypothesized
in the developmental model of schizophrenia [Brixey et al., J. Clin. Psychol.,
49:447-456 (1993)]. A second common polymorphism of MTHFR, the nt1298 A->C
mutation, could also be a genetic risk factor for spina bifida [van der Put et al.,
Lancet, 346:1070-1071 (1995)].

Schizophrenia is a common disorder, affecting 1% or more of the population [Karno
et al., In: Comprehensive Textbook of Psychiatry/VI, 6th ed., Baltimore: Williams &
Wilkins, pp. 902-910 (1995)]. Thus, if a significant proportion of schizophrenia
shares a common etiology, both the genetic susceptibility factors and the
environmental factors must be common in the population. As mentioned earlier, a
significant fraction of the population is already sub-clinically deficient for folate and
for cobalamin; also, pregnancy may increase this fraction since dietary folate and
cobalamin requirements increase during that time. Several functional polymorphic
alleles of folate and cobalamin genes are also common in the population, including the
MTHFR mutations just mentioned and polymorphisms of thymidylate synthase
[Horie et al., Cell Struct. Funct., 20:191-197 (1995)], transcobalamin II [Li et al.,
Biochim. Biophys. Acta, 1219:515-520 (1994)], and folate-binding proteins [Li et al.,
1994, supra; Shen et al., Biochem., 33:1209-1215 (1994)]. Metabolic indicators of
folate or cobalamin deficiency, e.g. hyperhomocysteinemia and
hypermethylmalonicacidemia, are also common in the population [Naurath et al.,
Lancet, 346:85-89 (1995)]. Thus there exists a statistical basis for the hypothesis
that schizophrenia is a birth defect resulting from the action during gestation of
genetic risk factors and environmental factors related to folate and/or cobalamin that
lead to the generation of risk factors. Such factors are sufficiently common that, at
least in principle, all cases of schizophrenia could result from this mechanism.

Finally, folate, cobalamin, and pyridoxine are relevant for schizophrenia because of
findings in patients. Severe genetic deficiency of MTHFR may cause a
"schizophrenia" phenotype [Freeman et al., N. Engl. J. Med., 292:491-496 (1975);
Regland et al., J. Neural Transm. Gen. Sect., 98:143-152 (1994)]. Genetic deficiency
of other folate and cobalamin enzymes has been reported to cause nervous system
disease, psychiatric disease, or schizophrenia-like illness [Mudd et al., In: The
Metabolic and Molecular Bases of Inherited Disease, Scriver et al. (eds), New York:
McGraw-Hill, pp. 1279-1327 (1995); Hitzig et al., Ciba Found. Symp., 68:77-91
(1978); Cooper and Rosenblatt, Annu. Rev. Nutr., 7:291-320 (1987); Shevall and
Rosenblatt, Can. J. Neurol. Sci., 19:472-486 (1992); Hall, Br. J. Haematol.,
80:117-120 (1992)]. Likewise, dietary deficiencies of folate or cobalamin may have
similar effects [Cooper and Rosenblatt, Annu. Rev. Nutr., 7:291-320 (1987); Shevall and
Rosenblatt, Can. J. Neurol. Sci., 19:472-486 (1992)]. Methylfolate therapy reportedly
improved the clinical status of schizophrenics with borderline or definite folate
deficiency [Godfrey et al., Lancet, 2:392-395 (1990); Procter, Br. J. Psychiatry,
159:271-272 (1991)], although the improvement claimed was small and the finding
controversial. Folate deficiency has been associated with disturbances in mood
[Shulman, In: Folic Acid in Neurology, Psychiatry, and Internal Medicine, New
York: Raven Pr., 463-474 (1979)], and it has been suggested that the most common
neuropsychiatric abnormality in severe folate deficiency is depression
[Reynolds et al., Lancet, ii:196-198 (1984)]. Methyltetrahydrofolate reportedly
improved symptoms of depression in an open trial in elderly depressed patients
[Guaraldi et al., Ann. Clin. Psychiatry 5:101-105 (1993)]. Schizophrenics are reported
to have an 80% excess mortality from cardiovascular disease [Gottesman,
Schizophrenia Genesis: The Origins of Madness, W.H. Freeman & Co., N.Y. (1991)];
hyperhomocysteinemia, dietary folate deficiency, and
the MTHFR 677C->T mutation have been implicated in cardiovascular disease in
some studies [Morita et al., Circulation, 95:2032-2036 (1997)] but not others
[Anderson et al., J. Am. Coll. Cardiol. 30:1206-1211 (1997)]. Also, kynureninase, an
important enzyme of tryptophan metabolism that affects niacin metabolism and
serotonin synthesis, is pyridoxine-dependent. Niacin deficiency (pellagra) can cause
mental changes including psychosis and hallucinations [Wilson, Vitamin deficiency
and excess, pp. 472-480. In: Harrison's Principles of Internal Medicine, (Scriber et al.
eds.) McGraw-Hill, Inc., N.Y. (1994)]. Also, clozapine, risperidone, and olanzapine
are thought to exert their antipsychotic effect in schizophrenia in part through
serotonin receptor antagonism.

Gene Localization Studies in Schizophrenia and Folate/Cobalamin/Pyridoxine
Genes: If folate, cobalamin, or pyridoxine genes are susceptibility factors for
schizophrenia, it is possible that gene localization studies have already identified
candidate chromosome regions that contain such a gene (Tables 3, 5, and 7). For three
folate or cobalamin genes, DHFR, TCNII and TYMS, there is excellent concordance
with schizophrenia gene localization studies.

On chromosome 5, DHFR has been located at 5q11.2-13.2. A schizophrenia
translocation [t(1;5)(1q32.3;5q11.2-13.3)] was reported [McGillivray et al., Am. J.
Med. Genet., 35:10-13 (1990); Bassett, Br. J. Psychiatry, 161:323-334 (1992)]
affecting 5q11.2-5q13.3. A proband and uncle, both with schizophrenia and
eye-tracking abnormalities, had partial trisomy for 5q11.2-5q13.3; the third copy was
inserted at 1q32.3, giving a derivative chromosome, der(1)inv
ins(1;5)(q32.2;q13.3q11.2). The proband's mother had a balanced translocation but
was phenotypically normal, without schizophrenia or eye-tracking abnormalities. She
had the derivative chromosome 1 with extra material from chromosome 5 inserted but
a corresponding deletion in one of her chromosomes 5. She thus had only two copies
of 5q11.2-5q13.3. Further studies [Gilliam et al., Genomics, 5:940-944 (1989)]
showed that the DHFR gene is located within this deleted region, 5q11.2-13.3.
Another schizophrenia chromosome abnormality, inv5(p13;q13), has been reported
[Bassett, Br. J. Psychiatry, 161:323-334 (1992)] affecting 5q13.

On chromosome 5, two-point lod scores of 4.64 and 2.29 were found [Sherrington et
al., Nature, 336:164-167 (1988)] for the polymorphic markers D5S76 and D5S39,
respectively, in the region of the chromosome abnormality just discussed [McGillivray
et al., Am. J. Med. Genet., 35:10-13 (1990); Bassett, Br. J. Psychiatry, 161:323-334
(1992)] affecting 5q11.2-13.3. Two other linkage studies found small positive lod
scores in this region [Coon et al., Biol. Psychiatry, 34:277-289 (1993); Kendler and
Diehl, Schizophr. Bull., 19:261-285 (1993)], but numerous other studies excluded this
region under the assumptions and models used [Kendler and Diehl, Schizophr. Bull.,
19:261-285 (1993)].
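
For readers less familiar with lod scores, the short Python sketch below converts the two-point lod scores quoted above into odds. It relies only on the standard definition of a lod score as the base-10 logarithm of the likelihood ratio for linkage versus no linkage; the function name `lod_to_odds` is an illustrative choice and is not taken from the cited studies.

```python
def lod_to_odds(lod):
    """Convert a lod score (log10 of the likelihood ratio) into odds in favor of linkage."""
    return 10.0 ** lod


# Two-point lod scores quoted in the text for the chromosome 5 markers.
reported_lods = {"D5S76": 4.64, "D5S39": 2.29}

for marker, lod in reported_lods.items():
    # Print the odds implied by each reported lod score.
    print(f"{marker}: lod {lod:.2f} corresponds to odds of about {lod_to_odds(lod):,.0f} to 1")
```

On this scale each additional unit of lod score multiplies the odds by ten, which is why a score of 4.64 is far stronger evidence than one of 2.29.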

On chromosome 18, TYMS has been located at 18p11.32-p11.22. A ring
chromosome with deletion of 18pter-p11, 18q23-qter [Bassett, Br. J. Psychiatry,
161:323-334 (1992)] was reported in a kindred with schizophrenia and bipolar illness
[Bassett, Br. J. Psychiatry, 161:323-334 (1992)]. Deletion of a segment of 18p was
reported in a schizophrenia chromosome [Bassett, Br. J. Psychiatry, 161:323-334
(1992)].

On chromosome 22, TCNII has been located at 22q11.2-q13, possibly at the 22q12/13
border. High lod scores have consistently been obtained in the region of TCNII:
IL2RB, in 22q12-q13.1, gave a lod score [Pulver et al., Am. J. Med. Genet., 54:3-43
(1994)] of 2.82. Other markers over a broad region of 22q have given suggestive lod
scores. D22S278, in 22q12, gave a lod score [Vallada et al., Am. J. Med. Genet.,
60:139-146 (1995)] of 1.51. CRYB2, in 22q11.2-q12.1, gave a lod score [Lasseter et
al., Am. J. Med. Genet. 60:172-173 (1995)] of 1.71. D22S10, in 22q11.1-q11.2, gave
a lod score [Coon et al., Biol. Psychiatry, 34:277-289 (1993)] of 0.79. Highly
significant p-values for non-parametric analyses have also been obtained: D22S278,
in 22q12, for example, gave p=.001 [Gill et al., Am. J. Med. Genet., 67:40-45 (1996)].

The deletions of velocardiofacial (VCF) syndrome and related disorders (DiGeorge
syndrome (DGS) and CATCH22) are located [Lindsay et al., Genomics, 32:104-112
(1996)] at 22q11.2. A psychotic disorder develops in about 10% of patients with
VCF syndrome [Chow et al., Am. J. Med. Genet. 54:107-112 (1994)]. TCNII is not
known to be located at or within these deletions. VCF and related disorders are
relatively uncommon compared to schizophrenia; only 2 of 100 randomly selected
patients (92 schizophrenics, 5 with schizoaffective disorder, and 3 with
schizophreniform disorder) in the Maryland Epidemiological Sample were found
[Lindsay et al., Am. J. Hum. Genet., 56:1502-1503 (1995)] to have VCF-related
deletions (and later VCF syndrome) on 22q11.2. Consequently, it is not clear
whether schizophrenia linkage studies are detecting a haplotype related to a VCF
locus or some other locus in this region, such as TCNII.

For some other folate, cobalamin, or pyridoxine relevant genes, physical or genetic 
studies of schizophrenia have identified chromosomal regions near the gene. 

DISCUSSION

The folate-cobalamin hypothesis for schizophrenia is attractive because it suggests
that a single mechanism of genetic and environmental factors may play a major role
in the etiology and pathogenesis of schizophrenia. The combined result of this
mechanism is to damage fetal development, especially brain development, by
inhibiting nucleic acid synthesis, by affecting gene methylations, by increasing
susceptibility to infection, and/or by producing teratogens.

This mechanism addresses several puzzling features of schizophrenia, such as the
season-of-birth effect, the association with famine and influenza epidemics, the
negative association with rheumatoid arthritis, and the associations with obstetrical
abnormalities, social class, and urban environment. The mechanism also suggests
approaches to diagnostic testing, to prevention, and to improved therapy.

It is not excluded that such a mechanism could also apply to a number of common
human developmental disorders that have been shown to have a genetic component to
their etiology but whose mode of inheritance has been difficult to determine and for
which linkage studies have met with unexpected difficulties or have achieved limited
success. These developmental disorders include Tourette's syndrome and related
disorders (e.g. obsessive-compulsive disorder and chronic multiple tics syndrome)
[Pauls, Adv. Neurol., 58:151-157 (1992); McMahon et al., Adv. Neurol., 58:159-165
(1992); Heutink et al., Am. J. Hum. Genet., 57:465-473 (1995); Grice et al., Am. J. Hum.
Genet., 59:644-652 (1996)], learning disorders, including dyslexia [Lewis et al.,
Behav. Genet., 23:291-297 (1993); Pennington, J. Child Neurol. 10 Suppl. 1:S69-S77
(1995)], conduct disorder [Lombroso et al., J. Am. Acad. Child Adolesc. Psychiatry,
33:921-938 (1994)], attention-deficit hyperactivity disorder [Lombroso et al., J.
Am. Acad. Child Adolesc. Psychiatry, 33:921-938 (1994)], bipolar illness [Baron,
Acta Psychiatr. Scand., 92:81-86 (1995); Benjamin and Gershon, Biol. Psychiatry,
40:313-316 (1996); Risch and Botstein, Nature Genet., 12:351-353 (1996); Jamison
and McInnis, Nature Med., 2:521-522 (1996); Morell, Science, 272:31-32 (1996)],
autism [Lombroso et al., J. Am. Acad. Child Adolesc. Psychiatry, 33:921-938
(1994)], and obsessive-compulsive disorder in adults [Lombroso et al., J. Am.
Acad. Child Adolesc. Psychiatry, 33:921-938 (1994)]. Some of these disorders have
been shown to be associated with schizophrenia.

The present invention may be better understood by reference to the following
non-limiting Examples, which are provided as exemplary of the invention. The following
Examples are presented in order to more fully illustrate one embodiment of the
invention. They should in no way be construed, however, as limiting the broad scope
of the invention.

EXAMPLE 1 

DIAGNOSING SCHIZOPHRENIA 

Structure of Datafiles

Data are arranged in a file suitable for input into a binary logistic regression program
(Table 8). A model is created consisting of those explanatory variables actually
available from the specific patient-to-be-diagnosed and the family members participating
in the testing. This new combined data set (reference data set + data from the
patient-to-be-diagnosed with participating family members) is analyzed by binary
logistic regression for the model chosen, giving the predicted probability that a
proband is affected with schizophrenia for all of the probands, including the
patient-to-be-diagnosed.
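
The procedure just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the regression program contemplated in the text: every data value is invented, the three explanatory variables are hypothetical placeholders, and the fit uses plain gradient descent with a small ridge penalty (an addition made here only to keep the estimates finite on a tiny data set). As in the text, the patient-to-be-diagnosed is added to the reference data with response=0 before fitting.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    """Predicted probability of affection; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(rows, labels, steps=2000, lr=0.1, l2=0.01):
    """Fit binary logistic regression by gradient descent (ridge penalty is an assumption)."""
    n, k = len(rows), len(rows[0])
    weights = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        for j in range(k + 1):
            penalty = l2 * weights[j] if j > 0 else 0.0  # intercept is not penalized
            weights[j] -= lr * (grad[j] / n + penalty)
    return weights


# Hypothetical reference data: each row holds three 0/1 explanatory variables.
reference_rows = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 1],   # affected probands
                  [0, 0, 0], [0, 0, 1], [1, 0, 0], [0, 0, 0]]   # control probands
reference_resp = [1, 1, 1, 1, 0, 0, 0, 0]

# The patient-to-be-diagnosed joins the combined data set with response=0.
patient_row = [1, 1, 1]
combined_rows = reference_rows + [patient_row]
combined_resp = reference_resp + [0]

weights = fit_logistic(combined_rows, combined_resp)
patient_probability = predict(weights, patient_row)
print(f"Predicted probability for the patient-to-be-diagnosed: {patient_probability:.2f}")
```

In practice an established statistics package would replace the hand-written fitting loop; the sketch is intended only to show the shape of the combined data set and where the predicted probability comes from.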

The model can be modified if required. The goodness of fit for the
patient-to-be-diagnosed is checked. The predicted probability that the
patient-to-be-diagnosed has schizophrenia is compared with a classification table
generated from the model used, in order to determine the likelihood of false positives
and false negatives. The predicted probability that the patient-to-be-diagnosed is
affected with schizophrenia, with the likelihood of a false positive or false negative
result, is returned to the clinician.

TABLE 8

A HYPOTHETICAL PARTIAL REFERENCE DATA SET OF GENETIC
EXPLANATORY VARIABLES TO ILLUSTRATE DATA STRUCTURE

ID    resp  P111  P112  P211  P212  M111  M112  M311  F511  S2-411
1           1     0     1     1     1     1     0     0     1       1
2           1     0     0     0     0     0     0     1     0       0
3           1     1     1     0     1     0     0     1     1       1
4           0     0     0     0     0     0     1     0     0       0
5           0     0     1     1     1     1     0     0     0       1
6     0     1     0     0     0     0     0     0     0     0       0
7     0     0     0     0     0     0     0     1     0     0       0
8     0     0     0     1     0     0     0     0     1     1       0
9     0     1     0     0     0     1     0     0     0     1       1
10    0     0     0     0     0     1     0     0     0     1       0
...

For each proband (Table 8), the record contains several variables:

an identification number (ID) of the proband.

a binary response variable (resp) for affection status of the proband: response=1 if the
proband is affected with schizophrenia; response=0 if the proband is unaffected (i.e. a control individual).
The proband is not necessarily one of the individuals for whom genotype data (explanatory variables)
are available. The patient-to-be-diagnosed is assigned response=0 when added to the reference data
set.

a set of explanatory variables: i.e. sets of genotypes of mutations found in the schizophrenia
patients and family members and controls and family members. The schizophrenia patients and the
control individuals are probands (P), as is the patient-to-be-diagnosed. Unaffected family members are
the proband's mother (M), father (F), sib(s) (S1, S2, etc.), child(ren) (C1, C2, etc.) or other relatives.
Data for affected family members, e.g. the proband's mother (MA), father (FA), sibs (SA1, SA2, etc.),
children (CA1, CA2, etc.), or other relatives, are entered as separate explanatory variables.
Genetic explanatory variables: Each individual has 0, 1, or 2 copies of any given
mutation allele at a given locus. Thus a genotype at each locus contributes two
independent explanatory variables. Most of the affected family members will be
relatives of schizophrenia probands, but occasionally a relative of an unaffected
proband will turn out to be affected with schizophrenia.

Mutations are tabulated as explanatory variables (see Table 8):

(i) by the proband or relative in whom they occur (e.g. P, M, F, S2, C1, MA, FA,
SA1, CA1, other);

(ii) by the specific folate, cobalamin, or pyridoxine gene locus in which they
occur (e.g. 1=DHFR locus, 2=MTHFR locus, 3=TCN2 locus, 4=MTR locus, 5=CBS
locus, etc.);

(iii) by the specific mutation within a locus (e.g., 1=the first-designated mutation
within a locus, 2=the second-designated mutation within a locus, etc.); and

(iv) by whether the individual has a single or double dose of the mutation. Thus
an explanatory variable P321 records whether the proband has a single dose of the
second-designated mutation of the third-designated locus, i.e. TCN2. A variable
M312 records whether the proband's mother has a double dose of the first-designated
TCN2 mutation studied.
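
The naming scheme in items (i)-(iv) can be sketched in code. The following Python fragment is only an illustration, not part of the method as filed: the function name `parse_variable`, the `GeneticVariable` record, and the assumption that locus and mutation numbers are each a single digit are hypothetical choices made for the example. The locus numbering is the one given by way of example in item (ii).

```python
import re
from typing import NamedTuple


class GeneticVariable(NamedTuple):
    relative: str  # e.g. "P", "M", "F", "S2", "CA1"
    locus: int     # locus number as in item (ii)
    mutation: int  # designated mutation within the locus, item (iii)
    dose: int      # 1 = single dose, 2 = double dose, item (iv)


# Example locus numbering from item (ii); the text ends that list with "etc."
LOCI = {1: "DHFR", 2: "MTHFR", 3: "TCN2", 4: "MTR", 5: "CBS"}

# Assumption for this sketch: locus and mutation numbers are single digits,
# and an optional hyphen separates the relative code from the digits (S2-411).
_NAME = re.compile(r"^([A-Z]+\d*)-?(\d)(\d)([12])$")


def parse_variable(name: str) -> GeneticVariable:
    """Split a variable name such as 'P321' or 'S2-411' into its parts."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized variable name: {name!r}")
    relative, locus, mutation, dose = match.groups()
    return GeneticVariable(relative, int(locus), int(mutation), int(dose))


def describe(name: str) -> str:
    """Return a readable description of a variable name."""
    var = parse_variable(name)
    locus = LOCI.get(var.locus, f"locus {var.locus}")
    dose = "single" if var.dose == 1 else "double"
    return f"{var.relative}: {dose} dose of mutation {var.mutation} at {locus}"
```

For instance, `describe("M312")` reports a double dose of mutation 1 at TCN2 for relative M, in agreement with the example given in item (iv).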

In the present hypothetical reference data set of genetic explanatory
variables (Table 8), partial genotype data for probands, mothers, fathers, sibs and
children are given for five gene loci. Not all of the possible explanatory variables are
shown. Probands 1-5 are unrelated individuals with the definite clinical diagnosis of
schizophrenia; probands 6-10 are unrelated unaffected (control) individuals.
Probands 1, 2, 3, 6 and 9 all have a single copy of the first-designated DHFR
mutation; proband 3 also has a second copy of that mutation. Probands 1, 3, 5 and 8
all have a single copy of the first-designated mutation at the MTHFR locus; probands
1 and 5 also have a second copy of that mutation. Mothers of probands 1, 3, 5, 9 and
10 all have a single copy of the first-designated DHFR mutation; mothers of probands
1 and 5 also have a second copy of this mutation. Mothers of probands 4 and 7 each
have a single copy of the first-designated mutation of TCN2; data for a double dose
are not shown. The fathers of probands 2, 3, and 8 each have a single copy of the
first-designated mutation of CBS; data for a double dose are not shown. The second
(unaffected) sibs of probands 1, 3, 8, 9, and 10 each have a single copy of the
first-designated mutation of MTR; data for a double dose are not shown. The first
affected children of probands 1, 3, 5, and 9 each have a single copy of the
first-designated mutation of DHFR. Other susceptibility loci and mutations can be
incorporated in Table 8 in the same fashion, e.g., cytokine gene mutations or
polymorphisms, or major histocompatibility complex (MHC) mutations or
polymorphisms.

Environmental explanatory variables: If only genetic explanatory variables
(genotype data) are used, the maximum predicted probability that the proband is
affected with schizophrenia is expected to be approximately 0.5 in most
populations. When environmental risk factors are included as explanatory variables,
the maximum predicted probability that the proband is affected with schizophrenia
may approach 1.0. Examples of environmental risk factors for a schizophrenia patient
include:

(1) the proband's dietary folate/cobalamin/pyridoxine intake.

(2) the proband's circulating levels of folate/cobalamin/pyridoxine.

(3) the proband's circulating levels of homocysteine, methylmalonic acid, or
cystathionine. Elevated levels are indicators of subtle folate/cobalamin deficiency.

(4) the proband's mother's dietary folate/cobalamin/pyridoxine intake at the time
of patient diagnosis, during a pregnancy, or during the pregnancy that produced the
proband.

(5) the proband's mother's circulating levels of homocysteine, methylmalonic
acid, or cystathionine at the time of patient diagnosis, during a pregnancy, or during
the pregnancy that produced the proband.

(6) dietary or circulating folate/cobalamin/pyridoxine or circulating levels of
homocysteine, methylmalonic acid, or cystathionine for other family members.

(7) epidemiological factors related to the proband's gestation and birth, e.g. low
birth weight or preterm birth, maternal infection, maternal smoking (associated with
low plasma folate), season of birth (late winter or spring births are more common in
schizophrenia), etc.

Method of Data Analysis 
The method exemplified herein is based upon the published guide for the SAS
system, but other software can be used. The dataset is analyzed using binary logistic
regression to model the response probability, p_i, that the ith proband's affection status
is 1, i.e. the probability that the ith proband has schizophrenia, given the vector of
explanatory variables, x_i. That is:

p_i = Prob(y_i = 1 | x_i).

To do this the logit transformation of p_i is modeled as a linear function of the
explanatory variables in the vector, x_i:

logit(p_i) = log(p_i/[1 - p_i]) = alpha + beta'x_i

where: alpha is the intercept parameter and
beta is the vector of slope parameters.

In SAS, the "descending" option is used to model the probability that the response=1,
as in the present analysis, rather than response=0.
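
The logit model above can be illustrated with a short Python sketch. It is offered only as an aid to the reader and does not reproduce the SAS procedure: the function names `logit` and `response_probability` and the numerical values used in the usage note are hypothetical.

```python
import math
from typing import Sequence


def logit(p: float) -> float:
    """Logit transformation: log(p / [1 - p]), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def response_probability(alpha: float,
                         beta: Sequence[float],
                         x: Sequence[float]) -> float:
    """Invert the logit: p = 1 / (1 + exp(-(alpha + beta'x)))."""
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    linear_predictor = alpha + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

With an intercept of 0 and all slope parameters equal to 0, `response_probability` returns 0.5 for any x; applying `logit` to the returned probability recovers alpha + beta'x, up to floating-point rounding.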

Outputs of binary logistic regression analysis 
After analysis of a dataset, the outputs obtained from SAS include: 

(a) Estimates and standard errors of the parameters (alpha and beta).
Using estimates of the intercept parameter (alpha) and the slope parameter (beta) for
each environmental or genetic risk factor, the logistic regression equation for the
dataset can be written.

(b) Significance tests of the parameters (e.g. Wald chi-square). From the
corresponding p-values, the level of significance of each of the environmental or
genetic risk factors is determined. A global significance test of the data with
corresponding p-value is also determined.

(c) Odds ratios are given for the slope parameters of each environmental
or genetic risk factor. Thus the amount contributed by each environmental or genetic
risk factor to the risk of schizophrenia is determined.

(d) The confidence limits for regression parameters and odds ratios are
determined.

(e) The predicted probabilities of the observations can be computed, i.e.
the probability that each individual in the dataset has schizophrenia:

alpha^ = estimate of the intercept parameter;
beta^ = vector of the estimates of the slope parameters;
x = vector of the explanatory variables;
p^ = predicted probability:

p^ = 1 / (1 + exp(-alpha^ - beta^'x))

(f) The model is modified by adding or removing variables until a model
is found that best fits the data.

(g) The model is tested for goodness-of-fit. Also, the degree of influence
of each specific observation is tested to detect extreme or ill-fitting observations.
These may be examples of data entry errors or, alternatively, observations that do not
fit the present model for schizophrenia.

(h) The probability that a new individual (the patient-to-be-diagnosed) is
schizophrenic is then calculated from the final, modified, best-fitting regression
equation based upon parameters derived from a corrected/modified data set. A simple
method of doing this is to add the data for the patient-to-be-diagnosed to the reference
data set, a large group of well-studied schizophrenia probands, schizophrenia family
members, control probands and control family members for whom data are available
for many explanatory variables. A model is created consisting of those informative
explanatory variables actually available from the specific patient-to-be-diagnosed and
family members participating in the testing. This new combined data set (reference
data set + data from patient-to-be-diagnosed with participating family members) is
analyzed by binary logistic regression for the model chosen, giving the predicted
probability that a proband is affected with schizophrenia for all of the probands
including the patient-to-be-diagnosed.

(i) A classification table is produced from the data set by the "jack
knifing" procedure or an approximation to it. This procedure classifies each
observation as an event or nonevent based on the model that omits the observation
being classified. A classification table sorts observations into percent correct, percent
false positives, and percent false negatives at various probability levels and computes
sensitivity and specificity.

(j) The data set used for diagnostic testing is constantly being updated
and the regression equation corrected. For example, stratification by geographic
residence or geographic origin of ancestors must be considered for some
environmental or genetic risk factors.
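
The classification table of item (i) can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the SAS implementation: it assumes that each observation has already been assigned a predicted probability from a model that omitted that observation, and it uses one common convention in which the false positive rate is the fraction of predicted events that are nonevents and the false negative rate is the fraction of predicted nonevents that are events. The function name `classification_row` and the data in the usage note are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def classification_row(observations: Iterable[Tuple[int, float]],
                       cutoff: float) -> Dict[str, float]:
    """Summarize (response, predicted probability) pairs at one probability level."""
    tp = fp = tn = fn = 0
    for response, probability in observations:
        predicted_event = probability >= cutoff
        if predicted_event and response == 1:
            tp += 1          # affected, classified as affected
        elif predicted_event:
            fp += 1          # unaffected, classified as affected
        elif response == 1:
            fn += 1          # affected, classified as unaffected
        else:
            tn += 1          # unaffected, classified as unaffected

    def ratio(numerator: int, denominator: int) -> float:
        # An empty denominator leaves the quantity undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "correct": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "false_positive": ratio(fp, tp + fp),
        "false_negative": ratio(fn, tn + fn),
    }
```

Calling `classification_row` at several cutoffs (for example 0.1, 0.2, 0.3) on the same list of observations yields one row of the table per probability level.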

For example, in Table 9, entries 34-43 are shown for the data file containing 
genotypes of 38 schizophrenic probands plus 211 control probands; the first 38 are 
the affected probands. For individual 302088, the proband is affected ("1"); there is a 
single dose ("1") of the DHFR mutation but not a double dose ("0") and a single dose 
("1") of the MTHFR mutation but not a double dose ("0"). The number 302088 
identifies the individual whose genotypes are listed; the proband, in this case, is the 
same individual. 
TABLE 9

SAS DATAFILE FOR SCHIZOPHRENIA PATIENTS AND CONTROLS

Entry   ID       resp   Ds   Dd   Ms   Md
34      302086   1      1    0    1    1
35      302088   1      1    0    1    0
36      302110   1      1    0    1    0
37      302111   1      1    0    0    0
38      302136   1      1    1    1    0
39      100001   0      1    0    0    0
40      100061   0      0    0    0    0
41      100064   0      1    0    1    0
42      100067   0      0    0    1    0
43      100073   0      1    0    0    0


In Table 10, entries 31-40 are shown for the data file containing genotypes of 35
mothers of schizophrenic probands plus (the same) 211 control probands. For
individual 302083, the proband is affected ("1"); there is a single dose of the DHFR
mutation ("1") but not a double dose ("0"); there is neither a single ("0") nor a double
("0") dose of the MTHFR mutation. The number 302083 identifies the individual
whose genotypes are listed, a mother; the proband, in this case, is a different
individual, her affected child.
TABLE 10

SAS DATAFILE FOR SCHIZOPHRENIA MOTHERS AND CONTROLS

Entry   ID       resp   Ds   Dd   Ms   Md
31      302083   1      1    0    0    0
32      302103   1      0    0    1    0
33      302104   1      0    0    1    0
34      302105   1      1    0    1    0
35      302120   1      0    0    0    0
36      100001   0      1    0    0    0
37      100061   0      0    0    0    0
38      100064   0      1    0    1    0
39      100067   0      0    0    1    0
40      100073   0      1    0    0    0


In Table 11, entries 11-20 are shown for the data file containing genotypes of 15
fathers of schizophrenic probands plus (the same) 211 control probands. For
individual 302084, the proband is affected ("1"); there is a single dose ("1") but not a
double dose ("0") of the DHFR mutation; there is both a single ("1") and a double
dose ("1") of the MTHFR mutation. The number 302084 identifies the individual
whose genotypes are listed, a father; the proband, in this case, is a different
individual, his affected child.
TABLE 11

SAS DATAFILE FOR SCHIZOPHRENIA FATHERS AND CONTROLS

Entry   ID       resp   Ds   Dd   Ms   Md
11      302102   1      0    0    0    0
12      302106   1      1    0    0    0
13      302115   1      1    0    0    0
14      302117   1      1    0    0    0
15      302084   1      1    0    1    1
16      100001   0      1    0    0    0
17      100061   0      0    0    0    0
18      100064   0      1    0    1    0
19      100067   0      0    0    1    0
20      100073   0      1    0    0    0


In Table 12, entries 9-18 are shown for the data file containing genotypes of 13
unaffected sibs of schizophrenic probands plus (the same) 211 control probands. For
individual 302089, the proband is affected ("1"); there is a single dose ("1") but not a
double dose ("0") of the DHFR mutation; there is both a single ("1") and a double
dose ("1") of the MTHFR mutation. The number 302089 identifies the individual
whose genotypes are listed, an unaffected sib; the proband, in this case, is a different
individual, the affected sib of individual 302089.
TABLE 12

SAS DATAFILE FOR SCHIZOPHRENIA SIBS AND CONTROLS

Entry   ID       resp   Ds   Dd   Ms   Md
09      302071   1      1    1    0    0
10      302073   1      0    0    1    0
11      302089   1      1    0    1    1
12      302118   1      1    0    0    0
13      302126   1      1    0    0    0
14      100001   0      1    0    0    0
15      100061   0      0    0    0    0
16      100064   0      1    0    1    0
17      100067   0      0    0    1    0
18      100073   0      1    0    0    0


In Tables 9-12, for individual 100061, the proband is unaffected ("0"); there is neither
a single dose ("0") nor a double dose ("0") of the DHFR mutation; there is neither a
single dose ("0") nor a double dose ("0") of the MTHFR mutation. Since the proband
is unaffected, this is a control individual. The number 100061 identifies the individual
whose genotypes are listed as a control individual; the proband, in this case, is the
same individual. The identical group of control individuals is used for all four
comparisons.
EXAMPLE 2

Distribution of Folate Gene Polymorphism Genotypes Among Schizophrenics,
Schizophrenia Parents, Schizophrenia Sibs, and Controls

Summary 

The DNA polymorphism-Diet-Cofactor-Development hypothesis (DDCD hypothesis,
described above) postulates that schizophrenia results in part from developmental
brain damage sustained in utero from the aggregate effect of maternal defects of
genes related to important cofactors, e.g. folate, cobalamin, pyridoxine, potentiated by
a maternal dietary deficiency of these cofactors. The maternal damage to the fetus
results in part from insufficiency of these cofactors themselves and in part from
resulting effects such as immune deficiency and maternal teratogens, e.g.
hyperhomocysteinemia. Genes from either parent acting in the fetus may modify
these damaging effects as outlined in the gene-teratogen model (described above).

The hypothesis addresses all of the unusual biological and epidemiological features of
schizophrenia: e.g. the decreased amount of grey matter in brain areas, the unusual
birth-month effect, the geographical differences in incidence, the socioeconomic
predilection, the association with obstetrical abnormalities (low birth weight and
prematurity), the decreased incidence of rheumatoid arthritis, and the association with
viral epidemics (described above).

The hypothesis can be supported by finding significant association of sequence
variants of folate, cobalamin, or pyridoxine genes with schizophrenia. Folate,
cobalamin, and pyridoxine absorption, transport, and metabolism are complex
[Rosenblatt, In: The Metabolic and Molecular Bases of Inherited Disease, Scriver et
al. (eds). New York: McGraw-Hill, pp. 3111-3128 (1995); Benton and Rosenberg, In:
The Metabolic and Molecular Bases of Inherited Disease, Scriver et al. (eds). New
York: McGraw-Hill, pp. 3129-3149 (1995); Whyte et al., Hypophosphatasia, In: The
Metabolic and Molecular Bases of Inherited Disease, Scriver et al. (eds). New York:
McGraw-Hill, pp. 4095-4111] with multiple transport proteins, enzymes, and
regulatory components. A strong candidate for harboring a mutation predisposing to
schizophrenia is the DHFR gene coding for the folate enzyme dihydrofolate
reductase. DHFR chemically reduces dietary folate, converting it into a form that can
enter cellular metabolism. DHFR is also important for DNA synthesis and is known
to play a major role in development in utero. A novel polymorphic 19 base pair
deletion of the DHFR gene has been isolated which could be of functional
significance because it affects potential transcription factor binding sites.

A second candidate is the MTHFR gene, coding for methylenetetrahydrofolate
reductase, MTHFR, an important enzyme of folate metabolism. MTHFR was of
particular interest because severe deficiency of enzyme activity has been associated
with the "schizophrenia" phenotype [Freeman et al., N. Engl. J. Med., 292:491-496
(1975); Regland et al., J. Neural Transm. Gen. Sect., 98:143-152 (1994)] and because
a common mutation, the nt677 C->T transition, results in a mutated gene that encodes
a heat-labile MTHFR having decreased enzymatic activity, which, in the presence of
dietary folate deficiency, causes the plasma homocysteine of homozygotes to become
elevated [van der Put et al., Lancet, 346:1070-1071 (1995); Frosst et al., Nature
Genet., 10:111-113 (1995)]. In adults, hyperhomocysteinemia is known to cause
vascular disease and to be toxic [Frosst et al., Nature Genet., 10:111-113 (1995)].
Therefore, homocysteine that crosses the placenta could act as a fetal teratogen during
pregnancy. Maternal folate deficiency could also have a more direct teratogenic
effect through fetal folate deprivation. These effects could be potentiated by
abnormalities of other folate, cobalamin, or pyridoxine genes, even if these
abnormalities were only minor.



Materials & Methods:
1. Subjects and Sample Collection: Patients with schizophrenia and unaffected family
members of schizophrenics were ascertained from patient facilities, patient support
groups, and family support group organizations. Nearly all schizophrenia families
had only a single case of schizophrenia. The patients came from different
schizophrenia families than the parents and sibs. The controls were unaffected and
unrelated individuals not known to be schizophrenic or related to patients with
schizophrenia or spina bifida. All subjects were of Caucasian background except two
of the schizophrenia patients who were of African American background.

After informed consent was obtained, 20-40 ml of blood was collected into EDTA
(purple-top) vacutainers, placed on ice immediately, and transported to the laboratory
where plasma, packed red cells, and buffy coat were separated by centrifugation and
frozen at -80°C.

2. Detection of Alleles: DNA was isolated using the QIAmp column DNA extraction
procedure or the QIAGEN Genomic-tip method (QIAGEN, Chatsworth, CA). Alleles
for a newly detected polymorphic 19 bp deletion in the dihydrofolate reductase
(DHFR) gene were determined by polymerase chain reaction (PCR) amplification of
the region surrounding the deletion using specific primers (Fig 1) and direct detection
of the PCR products after separation of products on a non-denaturing polyacrylamide
gel. A Cetus - Perkin-Elmer 9600 thermocycler was used. Briefly, the PCR reaction
contained 200 uM dNTPs, 1.5 mM MgCl2, and 10 pmols of each primer, in a 10 ul reaction
volume. The PCR conditions used were denaturation at 94°C for 6 min. initially,
followed by 35 cycles of 94°C for 55 sec, 60°C for 55 sec, and 72°C for 55 sec, and a
final extension at 72°C for 12 min.

Alleles for the 677C->T transition of the methylenetetrahydrofolate reductase
(MTHFR) gene were determined by cleavage with the restriction endonuclease,
HinfI, of PCR-amplified genomic DNA from blood and separation of the products by
non-denaturing polyacrylamide gel electrophoresis [Frosst et al., Nature Genet.,
10:111-113 (1995)].



3. Sequencing the Region Around the DHFR Deletion: Using the same primers
(Figure 1), genomic DNA from individuals with 1,1 and 2,2 genotypes was amplified
by PCR and the products sequenced using an ABI PRISM 377 automated sequencer.
Restriction sites were identified using the MAP Program in the GCG Package.
Potential transcription factor binding sites were detected with the TESS program
(transcription element search software,
URL:http://agave.humgen.upenn.edu/tess/index.html).

4. Data Analysis: Since the mode of inheritance of schizophrenia is unknown, binary
logistic regression was used to test the DHFR deletion allele and the MTHFR
heat-labile allele as genetic risk factors for schizophrenia. Either the DHFR deletion
polymorphism or the MTHFR heat-labile allele could itself be a genetic risk factor for
schizophrenia. The genotypes of the two folate gene polymorphisms were used as
explanatory variables. Genotypes of schizophrenia patients, parents, or sibs were
compared with those of controls.

Four files were constructed consisting of schizophrenia patients+controls, mothers of
schizophrenia patients+controls, fathers of schizophrenia patients+controls, and sibs
of schizophrenia patients+controls for input into the SAS System. Each dataset
contained 6 variables. In order, these were:

1. six-digit identification (ID) number;

2. response variable, i.e. affection status of the proband
(0=unaffected, i.e. control individual; 1=affected, i.e. schizophrenia patient);

3. DHFR mutation-single dose (Ds);

4. DHFR mutation-double dose (Dd);

5. MTHFR mutation-single dose (Ms); and

6. MTHFR mutation-double dose (Md).

For mutation data, 0=mutation absent, 1=mutation present.
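
The six-variable record described above can be sketched in Python as follows. The sketch is illustrative only. It assumes, consistent with the entries of Tables 9-12 in which a double dose is always accompanied by a single-dose value of 1, that an individual with two copies of a mutation is coded 1 for both the single-dose and the double-dose variable. The function names `dose_indicators` and `make_record` are hypothetical.

```python
from typing import Tuple


def dose_indicators(copies: int) -> Tuple[int, int]:
    """Map 0, 1, or 2 copies of a mutation to (single dose, double dose)."""
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1, or 2")
    single = 1 if copies >= 1 else 0   # at least one copy present
    double = 1 if copies == 2 else 0   # both alleles carry the mutation
    return single, double


def make_record(id_number: int, proband_affected: bool,
                dhfr_copies: int, mthfr_copies: int) -> Tuple[int, ...]:
    """Build one record: (ID, response, Ds, Dd, Ms, Md)."""
    ds, dd = dose_indicators(dhfr_copies)
    ms, md = dose_indicators(mthfr_copies)
    return (id_number, 1 if proband_affected else 0, ds, dd, ms, md)
```

Under this coding, an affected proband with one copy of each mutation, such as individual 302088 described for Table 9, is recorded as (302088, 1, 1, 0, 1, 0).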
Results

Alleles of the DHFR 19 bp Deletion Polymorphism: Amplification of the region of
intron 1 of DHFR defined by the primers in Figure 1 gave two polymorphic bands of
232 and 213 bp after separation on a non-denaturing polyacrylamide gel (Figure 2).
Sequencing the PCR products from the two homozygotes showed that they differed
by 19 bp (Figure 3). The upper and lower bands (Figure 2), the non-deletion allele and
the deletion allele, were designated alleles 1 and 2 respectively. Comparison
with two published sequences showed that allele 1 was identical with one of them
[Yang et al., J. Mol. Biol. 176:169-187 (1984)], indicating that allele 2 resulted from a
19 bp deletion. The other published sequence [Chen et al., J. Biol. Chem. 259:3933-
3943 (1984)] was lacking one base pair of allele 1, an A indicated by "*" in Fig 3. It
is possible that this shorter reference sequence [Chen et al., J. Biol. Chem. 259:3933-
3943 (1984)] resulted from a sequencing artifact.

Sequences in the 19 bp Deleted Region of DHFR Intron 1: The 19 bp sequence in the
deleted region (Fig 3) of DHFR intron 1 contained sites for several restriction
enzymes including RsaI and ScrFI, and potential binding sites for transcription factors
including Sp1, NF-kappaB, CP1 (NF-Y), E2F, ETF and GCF.

Binary Logistic Regression Analysis: The number of individuals with each genotype
of the two polymorphisms among 38 unrelated schizophrenia probands, 35 unrelated
mothers of schizophrenia probands, 15 unrelated fathers of schizophrenia probands,
13 unrelated unaffected sibs of schizophrenia probands, and 211 unrelated unaffected
control probands is shown in Table 13.
TABLE 13

DISTRIBUTION OF DHFR AND MTHFR MUTATION GENOTYPES
AND ALLELES AMONG CONTROLS, SCHIZOPHRENICS,
AND SCHIZOPHRENIA FAMILY MEMBERS

DHFR 19 bp deletion polymorphism:

            ------------------- Schizophrenia -------------------
Genotype    P             M             F             S             Ctrl
1/1          6 (.16)      10 (.29)       4 (.27)       4 (.31)       56 (.26)
1/2         22 (.58)      13 (.37)      11 (.73)       8 (.61)      115 (.54)
2/2         10 (.26)      12 (.34)       0 (0.0)       1 (.08)       40 (.19)
total       38 (1.00)     35 (1.00)     15 (1.00)     13 (1.00)     211 (.99)

MTHFR 677C->T transition polymorphism:

            ------------------- Schizophrenia -------------------
Genotype    P             M             F             S             Ctrl
1/1         14 (.37)      16 (.46)      11 (.73)       4 (.31)      103 (.49)
1/2         18 (.47)      18 (.51)       3 (.20)       8 (.61)       78 (.37)
2/2          6 (.16)       1 (.03)       1 (.07)       1 (.08)       30 (.14)
total       38 (1.00)     35 (1.00)     15 (1.00)     13 (1.00)     211 (1.00)

P=schizophrenia patients; M=mothers of schizophrenia patients; F=fathers of
schizophrenia patients; S=unaffected sibs of schizophrenia patients; Ctrl=control
individuals.
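
Allele frequencies follow from genotype counts such as those in Table 13 by simple counting, since each individual carries two alleles. The Python sketch below is illustrative only; the function name `allele_2_frequency` and the dictionary `DHFR_COUNTS` are hypothetical, and the counts are those listed in Table 13 for the DHFR 19 bp deletion polymorphism.

```python
from typing import Dict, Tuple


def allele_2_frequency(n_11: int, n_12: int, n_22: int) -> float:
    """Frequency of allele 2 given counts of genotypes 1/1, 1/2, and 2/2."""
    total_alleles = 2 * (n_11 + n_12 + n_22)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    # Heterozygotes contribute one copy of allele 2, homozygotes 2/2 two copies.
    return (n_12 + 2 * n_22) / total_alleles


# DHFR genotype counts (1/1, 1/2, 2/2) as listed in Table 13.
DHFR_COUNTS: Dict[str, Tuple[int, int, int]] = {
    "P": (6, 22, 10),
    "M": (10, 13, 12),
    "F": (4, 11, 0),
    "S": (4, 8, 1),
    "Ctrl": (56, 115, 40),
}

dhfr_allele_2 = {group: allele_2_frequency(*counts)
                 for group, counts in DHFR_COUNTS.items()}
```

For the controls, for example, the deletion allele (allele 2) accounts for 195 of 422 alleles.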
The four data files were analyzed using the logistic procedure of SAS (SAS Institute
Inc., 1995) and the "descending" option, which modeled the probability that
RESPONSE=1, that is, the probability that the proband was affected with
schizophrenia. Note that the proband was not always the individual whose genotype
data were used. For example, genotype data for mothers of schizophrenic probands
were used to determine the probability that their children, the probands, were affected.
Use of the "best" model selection options for logistic analysis in SAS gave the best
models for two and three explanatory variables (Table 14).



TABLE 14

BINARY LOGISTIC REGRESSION RESULTS

GENETIC RISK FACTOR MODEL: Ds Dd Ms Md
Odds Ratio (p value)

Schizophrenia Patients
Ds OR(p)    1.937 (.18)
Dd OR(p)    1.263 (.59)
Ms OR(p)    1.775 (.14)
Md OR(p)    0.914 (.86)

Mothers of Schizophrenia Patients
Ds OR(p)    0.630 (.31)
Dd OR(p)    2.653 (.028)*
Ms OR(p)    1.439 (.34)
Md OR(p)    0.143 (.065)

Fathers of Schizophrenia Patients
Ds OR(p)    1.178 (.79)
Dd OR(p)    0.000 (.96)
Ms OR(p)    0.366 (.14)
Md OR(p)    0.841 (.88)

Unaffected Sibs of Schizophrenia Patients
Ds OR(p)    1.104 (.88)
Dd OR(p)    0.337 (.31)
Ms OR(p)    2.688 (.12)
Md OR(p)    0.317 (.29)

Notes For Table 14

DHFR 19 bp deletion: Ds=single dose; Dd=double dose

MTHFR 677C->T mutation: Ms=single dose; Md=double dose

Logistic regression model:

Model with four explanatory variables (Ms, Md, Ds and Dd).

OR(p)=odds ratio and the corresponding p-value for that odds ratio
determination; *=significant at the p<.05 level.

0.000 odds ratios occurred since none of the fathers of schizophrenia patients
had genotype Dd; there was possibly a quasi-complete separation in the sample
points; the maximum likelihood estimate may not exist; and therefore the validity of the
model fit for these odds ratios was questionable.

The comparison of mothers of schizophrenia probands with control probands was
statistically significant. Ds was not a significant genetic risk factor. Neither Ms nor
Md in mothers was a significant genetic risk factor. However, the p-value for Md
decreased and approached significance (p=.065) at the p<.05 level.

Predicted Probabilities of the Various Genotypes: The "probs predicted" modality of
SAS gave the predicted probability that the proband was affected with schizophrenia
(response=1) given genotype data for control probands and schizophrenia patients
(probands), mothers of schizophrenia probands, fathers of schizophrenia probands, or
sibs of schizophrenia probands. The maximum probabilities obtained are shown in
Table 15. The highest maximum predicted probability that the proband was affected
was obtained for genotype data from mothers of schizophrenia probands, next for
schizophrenia probands, next for fathers of schizophrenia probands, and lowest for
sibs of schizophrenia probands.
TABLE 15

MAXIMUM PREDICTED PROBABILITY

Model          P      M      F      S

Ds Dd Ms Md    0.24   0.29   0.12   0.11

Model and explanatory variables are the same as in Table 14. 

Determination of Genotypes Conferring the Highest Risk: The predicted probabilities
that the proband was affected with schizophrenia given specific genotypes of control
probands and schizophrenia probands, mothers of schizophrenia probands, fathers of
schizophrenia probands, or sibs of schizophrenia probands were determined using the
model containing all four explanatory variables (Table 16). The predicted
probabilities that the proband was affected with schizophrenia were highest for
maternal genotypes (Table 15). The maternal genotype with the highest risk was Dd
Ms, conferring a probability of 0.29 of schizophrenia in the proband (Table 16). The
Dd Ms genotype also gave the highest predicted probability, 0.24, for schizophrenia
patients.
TABLE 16

PREDICTED PROBABILITIES FOR SPECIFIC GENOTYPES
Model: Ds Dd Ms Md

                   Predicted
Genotype           Probability

Schizophrenia Patients:
Dnull + Mnull      0.07
Dnull + Ms         0.12
Dnull + Md         0.11
Ds + Mnull         0.12
Dd + Mnull         0.15
Ds + Ms            0.20
Ds + Md            0.19
Dd + Ms            0.24
Dd + Md            0.23

Mothers of Schizophrenia Patients:
Dnull + Mnull      0.16
Dnull + Ms         0.20
Dnull + Md         0.03
Ds + Mnull         0.10
Dd + Mnull         0.22
Ds + Ms            0.13
Ds + Md            0.02
Dd + Ms            0.29
Dd + Md            0.06

Fathers of Schizophrenia Patients:
Dnull + Mnull      0.10
Dnull + Ms         0.04
Dnull + Md         0.03
Ds + Mnull         0.12
Dd + Mnull         0.0
Ds + Ms            0.05
Ds + Md            0.04
Dd + Ms            0.0
Dd + Md            0.0

Unaffected Sibs of Schizophrenia Patients:
Dnull + Mnull      0.04
Dnull + Ms         0.10
Dnull + Md         0.03
Ds + Mnull         0.04
Dd + Mnull         0.02
Ds + Ms            0.11
Ds + Md            0.04
Dd + Ms            0.04
Dd + Md            0.01

Genotypes consist of the same explanatory variables described in Table 14 except that Dnull has no
copy of the DHFR deletion and Mnull has no copy of the MTHFR 677C->T variant. Odds ratios of
0.0 were unsatisfactory as described in Table 14.
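
The relation between an odds ratio (Table 14) and a predicted probability (Table 16) can be sketched as follows. The Python fragment is illustrative only and is not the SAS computation: the function name `apply_odds_ratio` is hypothetical, and because the tabulated probabilities are rounded to two decimal places, values computed this way agree with the tables only approximately.

```python
def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Convert a baseline probability to odds, scale by an odds ratio,
    and convert the result back to a probability."""
    if not 0.0 <= baseline_probability < 1.0:
        raise ValueError("baseline probability must be in [0, 1)")
    if odds_ratio < 0.0:
        raise ValueError("odds ratio must be non-negative")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Mothers: Dnull + Mnull baseline of 0.16 (Table 16), Ds odds ratio of 0.630 (Table 14).
mothers_ds_mnull = apply_odds_ratio(0.16, 0.630)
```

Here `mothers_ds_mnull` is close to 0.107, in approximate agreement with the value of 0.10 tabulated for Ds + Mnull in mothers.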
Discussion

Structure and Function of the DHFR 19 bp Deletion Polymorphism: DHFR
polymorphisms have been reported previously [Feder et al., Nucl. Acids Res. 15:5906
(1987); Detera-Wadleigh et al., Nucl. Acids Res. 17:6432 (1989)]. It is known that
introns are important for message regulation, e.g., splicing, or as sites for binding
transcription factors. Since the first intron is a relatively common location for
regulatory elements, it is possible that the deleted region of DHFR intron 1 could play
a role in regulation of DHFR or that the deletion could be a genetic risk factor for
schizophrenia because it removes potential transcription factor binding sites.
Abnormalities of transcription factors and their binding sites may play a role in
disease. For example, a polymorphic Sp1 binding site in the collagen type I alpha 1
gene has been associated with reduced bone density and osteoporosis [Grant et al.,
Nature Genet. 14:203-205 (1996)].

The Nature of the Putative Folate Genetic Risk Factors for Schizophrenia: Dd in the
mother of a schizophrenia proband conferred significantly increased risk of
schizophrenia in her child (Table 14). The findings that Dd was a genetic risk factor
in mothers but not fathers of schizophrenia probands (Table 15) and that Dd in
mothers gave a higher predicted probability than in schizophrenia patients, fathers or
sibs (Tables 15 and 16) were consistent with the role of DHFR as a teratogenic locus
according to the gene-teratogen model (described above). The finding that a double
dose but not a single dose of the DHFR deletion in mothers was a genetic risk factor
(Table 16) supported a recessive mode of action in the mother. A teratogenic locus
acting in the mother can also act as a modifying or specificity locus in the fetus.

Neither Ms nor Md in mothers of schizophrenia probands showed statistical 
significance as genetic risk factors for schizophrenia in probands (Table 14). However, 
Md in mothers approached statistical significance (p=0.065) and appeared to be 
protective (odds ratio 0.14), while Ms in mothers appeared to increase risk modestly 
(odds ratio 1.44, p=0.34). 
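
As a reading aid, the odds ratios quoted above can be related to logistic regression coefficients, since in a binary logistic model the coefficient of an explanatory variable is the natural logarithm of its odds ratio. The following Python sketch is illustrative only: the function names and the baseline probability of 0.10 are assumptions chosen for the example and are not values taken from Table 14.

```python
import math


def coefficient_from_odds_ratio(odds_ratio):
    """Return the logistic regression coefficient (log-odds) for an odds ratio."""
    return math.log(odds_ratio)


def shifted_probability(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios reported above for mothers of probands.
REPORTED_ODDS_RATIOS = {"Md": 0.14, "Ms": 1.44}

# Hypothetical baseline probability, used only to illustrate the direction of the shift.
HYPOTHETICAL_BASELINE = 0.10

for label, odds_ratio in REPORTED_ODDS_RATIOS.items():
    beta = coefficient_from_odds_ratio(odds_ratio)
    probability = shifted_probability(HYPOTHETICAL_BASELINE, odds_ratio)
    print(f"{label}: coefficient {beta:+.3f}, probability {probability:.3f}")
```

An odds ratio below 1 gives a negative coefficient and lowers the probability relative to the baseline, which is the sense in which Md appeared protective; an odds ratio above 1 does the opposite.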




Role of Genetic and Environmental Factors in Schizophrenia: Since the probability 
that a schizophrenia co-twin is also affected is reported [Gottesman, Schizophrenia 
Genesis: The Origins of Madness, W.H. Freeman & Co., N.Y. (1991)] to be only 48%, 
a large part of the risk for schizophrenia would be anticipated to come from 
environmental factors. Therefore, some controls should have the genetic risk factors 
for schizophrenia but not be affected with schizophrenia. In the present data set, 
6 of 35 schizophrenia mothers and 7 of 38 schizophrenia patients had Dd Ms, the 
genotype conferring the highest risk, compared with 15 of 211 controls. Since this 
genotype gave predicted probabilities of schizophrenia in probands of 0.29 and 0.24 
respectively, polymorphisms of DHFR and MTHFR could account for a considerable 
portion of the genetic component of the risk of schizophrenia. 
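
The carrier counts in the preceding paragraph can be turned into simple proportions. The Python sketch below is a worked illustration only: it reads the control count as 15 of 211, and the crude (unadjusted) odds ratio it computes is not the logistic regression estimate reported in the tables. The function names are assumptions introduced for this example.

```python
def carrier_fraction(carriers, total):
    """Fraction of a group carrying the Dd Ms genotype."""
    return carriers / total


def crude_odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Unadjusted odds ratio of carrying the genotype in cases versus controls."""
    case_odds = case_carriers / (case_total - case_carriers)
    control_odds = control_carriers / (control_total - control_carriers)
    return case_odds / control_odds


# (carriers, total) as stated in the text; the control count is read as 15 of 211.
COUNTS = {"mothers": (6, 35), "patients": (7, 38), "controls": (15, 211)}

control_carriers, control_total = COUNTS["controls"]
for group in ("mothers", "patients"):
    carriers, total = COUNTS[group]
    fraction = carrier_fraction(carriers, total)
    ratio = crude_odds_ratio(carriers, total, control_carriers, control_total)
    print(f"{group}: {fraction:.1%} carry Dd Ms, crude odds ratio {ratio:.2f}")

print(f"controls: {carrier_fraction(control_carriers, control_total):.1%} carry Dd Ms")
```

Because some controls also carry the genotype, the comparison is between frequencies rather than presence versus absence, which fits the text's point that carriers of the genetic risk factors need not be affected.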

Relation of DHFR to Cytogenetic and Linkage Data for Schizophrenia: As discussed 
above, the DHFR gene has been located on chromosome 5 at 5q11.2-13.2. A 
schizophrenia translocation was reported (McGillivray et al. 1990; Bassett, 1992) 
affecting 5q11.2-5q13.3. Also, two-point lod scores of 4.64 and 2.29 were found 
[Sherrington et al., Nature, 336:164-167 (1988)] for the polymorphic markers D5S76 
and D5S39, respectively, on chromosome 5 in this region [McGillivray et al., Am. J. 
Med. Genet., 35:10-13 (1990); Bassett, Br. J. Psychiatry, 161:323-334 (1992)]. Two 
other linkage studies found small positive lod scores in this region [Coon et al., Biol. 
Psychiatry, 34:277-289 (1993); Kendler and Diehl, Schizophr. Bull., 19:261-285 
(1993)], but numerous other studies excluded this region under the assumptions and 
models used [Kendler and Diehl, Schizophr. Bull., 19:261-285 (1993)]. Recently, 
new studies have found suggestive evidence for a potential susceptibility locus at a 
different region of 5q, 5q31 [Schwab et al., Nat. Genet. 11:325-327 (1997)] and 
5q22-31 [Straub et al., Molec. Psychiatr. 2:148-155 (1997)]. 

The case-control study presented herein illustrates the usefulness of the DNA 
polymorphism-Diet-Cofactor-Development and the gene-teratogen models described 
above. More importantly, the results presented herein clearly fail to reject the specific 
models, i.e., that folate gene polymorphisms can play a role in the etiology of 
schizophrenia. 

The present invention is not to be limited in scope by specific embodiments described 
herein. Indeed, various modifications of the invention in addition to those described 
herein will become apparent to those skilled in the art from the foregoing description 
and the accompanying figures. Such modifications are intended to fall within the 
scope of the appended claims. 

Various publications in addition to the immediately foregoing are cited herein, the 
disclosures of which are incorporated by reference in their entireties. 



We Claim: 

1. A method of generating a genetic reference dataset for use in the 
determination of the predicted probability for an individual of having a susceptibility 
for a developmental disorder due to genetic factors or for developing a developmental 
disorder due to genetic factors or for having offspring that develop a developmental 
disorder due to genetic factors comprising: 
(a) collecting a biological sample from a human subject; wherein the 
human subject is selected from the group consisting of a diagnostic proband, a blood 
relative of the diagnostic proband, an affected proband, a blood relative of the 
affected proband, a control proband, and a blood relative of the control proband; 
wherein the biological sample contains nucleic acids and/or proteins from the human 
subject; 
(b) analyzing the nucleic acids and/or proteins from the biological sample; 
wherein said analyzing results in a partial or full genotype for the alleles of the genes 
involved in folate, pyridoxine, and/or cobalamin metabolism; wherein said partial or 
full genotype forms a dataset of genetic explanatory variables for the human subject; 
and 
(c) compiling the dataset of genetic explanatory variables from multiple 
human subjects into a genetic reference dataset. 

2. A method of generating a genetic and environmental reference dataset for use 
in the determination of the predicted probability for an individual of having a 
susceptibility for a developmental disorder due to genetic factors and environmental 
factors or for developing a developmental disorder due to genetic factors and 
environmental factors or for having offspring that develop a developmental disorder 
due to genetic factors and environmental factors comprising: 
(a) obtaining dietary and epidemiological information for environmental 
explanatory variables for the human subjects of Claim 1; and 
(b) combining said environmental explanatory variables with a genetic 
reference dataset for the human subjects. 



3. The method of Claim 2 wherein the developmental disorder is selected from 
the group consisting of schizophrenia, spina bifida cystica, Tourette's syndrome, 
dyslexia, conduct disorder, attention-deficit hyperactivity disorder, bipolar illness, 
autism, chronic multiple tic syndrome and obsessive-compulsive disorder. 

4. A method of generating an environmental reference dataset for use in the 
determination of the predicted probability for an individual of having a susceptibility 
for a developmental disorder due to environmental factors or for developing a 
developmental disorder due to environmental factors or for having offspring that 
develop a developmental disorder due to environmental factors comprising: 
(a) obtaining dietary and epidemiological information for environmental 
explanatory variables for a human subject; wherein the human subject is selected 
from the group consisting of a diagnostic proband, a blood relative of the diagnostic 
proband, an affected proband, a blood relative of the affected proband, a control 
proband, and a blood relative of the control proband; and 
(b) compiling a dataset of environmental explanatory variables from 
multiple human subjects into an environmental reference dataset for the human 
subjects. 

5. A method of estimating the genetic susceptibility of an individual to have or 
to develop a developmental disorder comprising: 
(a) collecting a biological sample from one or more participants; wherein 
a participant is either the individual or a blood relative of the individual; and wherein 
the biological sample contains nucleic acids and/or proteins of the participant; 
(b) analyzing the nucleic acids and/or proteins from the biological sample; 
wherein said analyzing results in a partial or full genotype for the alleles of the genes 
involved in folate, pyridoxine, and/or cobalamin metabolism; and wherein said partial 
or full genotype forms a dataset of genetic explanatory variables for the participants; 
(c) adding the datasets of genetic explanatory variables obtained from 
steps (a) and (b) to a genetic reference dataset therein forming a combined genetic 
dataset; 
(d) formulating a model comprising the genetic explanatory variables 
obtained from the participants; and 
(e) analyzing the combined genetic dataset; wherein a predicted 
probability for the individual of having or developing a developmental disorder is 
determined; and wherein the genetic susceptibility of an individual to have or to 
develop a developmental disorder is estimated. 

6. The method of Claim 5 wherein said analyzing the combined genetic dataset 
is performed by binary linear regression. 

7. The method of Claim 6 further comprising the step of: 
(f) modifying the model by adding or subtracting a genetic explanatory 
variable; and re-analyzing the combined genetic dataset by binary logistic regression; 
wherein a model is chosen that best fits the data. 

8. The method of Claim 7 further comprising the step of: 
(g) testing the model for goodness of fit. 

9. The method of Claim 8 wherein the binary linear regression is performed with 
the SAS system. 

10. The method of Claim 5 wherein the developmental disorder is selected from 
the group consisting of schizophrenia, spina bifida cystica, Tourette's syndrome, 
dyslexia, conduct disorder, attention-deficit hyperactivity disorder, bipolar illness, 
autism, chronic multiple tic syndrome and obsessive-compulsive disorder. 

11. The method of Claim 10 wherein the developmental disorder is schizophrenia 
and the individual is suspected of being genetically susceptible of having or for 
developing schizophrenia. 

12. The method of Claim 11 wherein the individual is suspected of being 
genetically susceptible for having or for developing schizophrenia because a blood 
relative has schizophrenia. 

13. The method of Claim 12 wherein the blood relative is a parent, a sibling, or a 
grandparent. 

14. The method of Claim 13 wherein the blood relative is a parent and wherein the 
parent is the mother of the individual. 

15. A method of estimating the genetic and environmental susceptibility of an 
individual to have or to develop a developmental disorder comprising: 
(a) collecting a biological sample from one or more participants; wherein 
a participant is either the individual or a blood relative of the individual; and wherein 
the biological sample contains nucleic acids and/or proteins of the participant; 
(b) analyzing the nucleic acids and/or proteins from the biological sample; 
wherein said analyzing results in a partial or full genotype for the alleles of the genes 
involved in folate, pyridoxine, and/or cobalamin metabolism; and wherein said partial 
or full genotype forms a dataset of genetic explanatory variables for the participants; 
(c) obtaining dietary and epidemiological information for environmental 
explanatory variables for the participants; wherein said information forms a dataset of 
environmental explanatory variables for the participants; 
(d) adding the datasets of genetic explanatory variables obtained from 
steps (a) and (b) and the dataset of environmental explanatory variables of step (c) to 
a genetic and environmental reference dataset therein forming a combined genetic and 
environmental dataset; 
(e) formulating a model comprising the genetic and environmental 
explanatory variables obtained from the participants; and 
(f) analyzing the combined genetic and environmental dataset by binary 
logistic regression; 
wherein a predicted probability for the individual of having or developing a 
developmental disorder is determined; and wherein the genetic and environmental 
susceptibility of an individual to have or to develop a developmental disorder is 
estimated. 

16. The method of Claim 15 further comprising the step of: 
(g) modifying the model by adding or subtracting a genetic or 
environmental explanatory variable; and re-analyzing the combined genetic and 
environmental dataset by binary logistic regression; wherein a model is chosen that 
best fits the data. 

17. The method of Claim 16 further comprising the step of: 
(h) testing the model for goodness of fit. 

18. The method of Claim 17 wherein the binary linear regression is performed 
with the SAS system. 

19. A method of estimating the susceptibility of an individual to have offspring 
that develop a developmental disorder comprising: 
(a) collecting a biological sample from one or more participants; wherein 
a participant is either the individual or a blood relative of the individual; and wherein 
the biological sample contains nucleic acids and/or proteins of the participant; 
(b) analyzing the nucleic acids and/or proteins from the biological sample; 
wherein said analyzing results in a partial or full genotype for the alleles of the genes 
involved in folate, pyridoxine, and/or cobalamin metabolism; and wherein said partial 
or full genotype forms a dataset of genetic explanatory variables for the participants; 
(c) adding the datasets of genetic explanatory variables obtained from 
steps (a) and (b) to a genetic reference dataset therein forming a combined genetic 
dataset; 
(d) formulating a model comprising the genetic explanatory variables 
obtained from the participants; and 
(e) analyzing the combined genetic dataset by binary logistic regression; 
wherein a predicted probability for the individual to have offspring that 
develop a developmental disorder is determined; and wherein the genetic and 
environmental susceptibility of an individual to have offspring that develop a 
developmental disorder is estimated. 

20. The method of Claim 19 further comprising the step of: 
(f) modifying the model by adding or subtracting a genetic explanatory 
variable; and re-analyzing the combined genetic dataset by binary logistic regression; 
wherein a model is chosen that best fits the data. 

21. The method of Claim 20 further comprising the step of: 
(g) testing the model for goodness of fit. 

22. The method of Claim 21 wherein the binary linear regression is performed 
with the SAS system. 

23. The method of Claim 22 wherein the individual is a pregnant woman. 



24. A method of lowering the risk of a pregnant woman who has been determined 
by the method of Claim 23 to be susceptible to have offspring that develop a 
developmental disorder comprising administering methylfolate, cobalamin or 
pyridoxine to the pregnant woman, wherein said administering lowers the risk of the 
pregnant woman of giving birth to offspring with a developmental disorder. 

25. A method of determining if any treatment is advisable for a pregnant woman 
who has been determined by the method of Claim 23 to be susceptible to having 
offspring that develop a developmental disorder comprising determining the 
concentration of a risk factor from a tissue sample or body fluid from the pregnant 
woman; wherein when the concentration of the risk factor is statistically above or 
below an accepted normal range, treatment is advisable. 

26. A method of monitoring the effect of the administration of methylfolate, 
cobalamin or pyridoxine to the pregnant woman of Claim 25, comprising determining 
the concentration of a risk factor from a tissue sample or body fluid from the pregnant 
woman; and wherein when the concentration of the risk factor is statistically within 
an accepted normal range, the treatment is effective. 

27. The method of Claim 26 wherein the risk factor is selected from the group 
consisting of homocysteine, folate, and cobalamin. 

28. The method of Claim 22 wherein the individual is the mate of a pregnant 
woman. 

29. A method of treating an asymptomatic individual determined by the method of 
Claim 23 to be susceptible for developing a developmental disorder comprising 
administering methylfolate, cobalamin or pyridoxine. 

30. An isolated nucleic acid encoding a genetic variant of human dihydrofolate 
reductase comprising a nucleotide sequence having a 19 base-pair deletion spanning 
nucleotides 540 to 558 of the nucleotide sequence of SEQ ID NO:41. 

31. The isolated nucleic acid of Claim 30 that has the nucleotide sequence of SEQ 
ID NO:42. 

32. An expression vector comprising the nucleic acid of Claim 30 operably 
associated with an expression control sequence, wherein the nucleic acid is selected 
from the group consisting of cDNA or genomic DNA. 

33. A PCR primer that can be used to distinguish SEQ ID NO:42 from the 
nucleotide sequence selected from the group consisting of SEQ ID NO:41 and SEQ 
ID NO:45. 

34. The PCR primer of Claim 33 that comprises 10 to 50 consecutive nucleotides 
from the nucleotide sequence selected from the group of SEQ ID NO:41, the 
complementary strand of SEQ ID NO:41, SEQ ID NO:42, the complementary strand 
of SEQ ID NO:42, SEQ ID NO:45, and the complementary strand of SEQ ID NO:45. 

35. The PCR primer of Claim 34 wherein the 10 to 50 consecutive nucleotides are 
from nucleotides 350 to 530 of SEQ ID NO:41. 

36. The PCR primer of Claim 35 having the nucleotide sequence of 5'-CTA AAC 
TGC ATC GTC GCT GTG-3' (SEQ ID NO:38). 

37. The PCR primer of Claim 36 wherein the 10 to 50 consecutive nucleotides are 
from the complementary strand of nucleotides 550 to 850 of SEQ ID NO:41. 

38. The PCR primer of Claim 37 having the nucleotide sequence of 5'-AAA AGG 
GGA ATC CAG TCG G-3' (SEQ ID NO:39). 

39. An isolated nucleic acid that hybridizes under standard hybridization 
conditions to a nucleic acid having the nucleotide sequence 
ACCTGGGCGGGACGCGCCA (SEQ ID NO:40) or a sequence complementary to 
SEQ ID NO:40; wherein said isolated nucleic acid consists of 12 to 48 nucleotides. 

40. An isolated nucleic acid that hybridizes to the nucleotide sequence of SEQ ID 
NO:42, but not to the nucleotide sequence of SEQ ID NO:41; when said hybridizing 
is performed under identical conditions. 

41. An isolated nucleic acid that hybridizes to the complementary strand of the 
nucleotide sequence of SEQ ID NO:42, but not to the complementary strand of the 
nucleotide sequence of SEQ ID NO:41; when said hybridizing is performed under 
identical conditions. 



42. An isolated nucleic acid that hybridizes to the nucleotide sequence of SEQ ID 
NO:41, but not to the nucleotide sequence of SEQ ID NO:42; when said hybridizing 
is performed under identical conditions. 

43. An isolated nucleic acid that hybridizes to the complementary strand of the 
nucleotide sequence of SEQ ID NO:41, but not to the complementary strand of the 
nucleotide sequence of SEQ ID NO:42; when said hybridizing is performed under 
identical conditions. 



44. The method of Claim 5 wherein said analyzing the nucleic acids and/or 
proteins from the biological sample comprises determining if the biological sample 
contains a genetic variant of human dihydrofolate reductase having a nucleotide 
sequence with a 19 base-pair deletion spanning nucleotides 540 to 558 of the 
nucleotide sequence of SEQ ID NO:41; and wherein the genetic variant of human 
dihydrofolate reductase is an explanatory variable. 

45. The method of Claim 44 wherein said determining is performed by a method 
selected from the group consisting of PCR, special PCR, RT PCR, RFLP analysis, 
SSCP, and FISH. 



46. The method of Claim 1 wherein said analyzing the nucleic acids and/or 
proteins from the biological sample comprises determining if the biological sample 
contains the genetic variant of human dihydrofolate reductase having a nucleotide 
sequence with a 19 base-pair deletion spanning nucleotides 540 to 558 of the 
nucleotide sequence of SEQ ID NO:41; and wherein the genetic variant of human 
dihydrofolate reductase is an explanatory variable. 

47. The method of Claim 46 wherein said determining is performed by a method 
selected from the group consisting of PCR, special PCR, RT PCR, RFLP analysis, 
SSCP, and FISH. 

Abstract 



The present invention discloses a novel method for identifying an individual who may 
be susceptible to develop a developmental disorder. In one particular example, an 
individual is identified who is genetically susceptible to becoming schizophrenic. In 
addition, the present invention discloses a novel method for identifying individuals 
who are genetically susceptible to have offspring with a developmental disorder. 
Methods of diagnosing, preventing and treating developmental disorders such as 
schizophrenia are also provided. 

Primers for PCR Amplification of the DHFR Deletion Polymorphism Region 
Forward primer: 5'-CTA AAC TGC ATC GTC GCT GTG-3' 
Reverse primer: 5'-AAA AGG GGA ATC CAG TCG G-3' 

Genotypes of the DHFR 19 bp Deletion 
by Non-denaturing Polyacrylamide Gel Electrophoresis 

Sequences of PCR Amplification Products 
in the Region of the DHFR Deletion Polymorphism 
GCTGCCCACGGTCGGGGT GGCCGACTCCCGGCGAGA 



Figure 3 

1 CTGCAGCGCC AGGGTCCACC TGGTCGGCTG CACCTGTGGA GGAGGAGGTG 
51 GATTTCAGGC TTCCCGTAGA CTGGAAGAAT CGGCTCAAAA CCGCTTGCCT 
101 CGCAGGGGCT GAGCTGGAGG CAGCGAGGCC GCCCGACGCA GGCTTCCGGC 
151 GAGACATGGC AGGGCAAGGA TGGCAGCCCG GCGGCAGGGC CCGGCGAGGA 
201 GCGCGAACCC GCGGCCGCAG TTCCCAGGCG TCTGCGGGCG CGAGCACGCC 
251 GCGACCCTGC GTGCGCCGGG GCGGGGGGGC GGGGCCTCGC CTGCACAAAT 
301 AGGGACGAGG GGGCGGGGCG GCCACAATTT CGCGCCAAAC TTGACCGCGC 
351 GTTCTGCTGT AACGAGCGGG CTCGGAGGTC CTCCCGCTGC TGTCATGGTT 
401 GGTTCGCTAA ACTGCATCGT CGCTGTGTCC CAGAACATGG GCATCGGCAA 
451 GAACGGGGAC CTGCCCTGGC CACCGCTCAG GTATCTGCCG GGGCGGGGCG 
501 ATGGGACCCA AACGGGCGCA GGCTGCCCAC GGTCGGGGTA CCTGGGCGGG 
551 ACGCGCCAGG CCGACTCCCG GCGAGAGGAT GGGGCCAGAC TTGCGGTCTG 
601 CGCTGGCAGG AAGGGTGGGC CCGACTGGAT TCCCCTTTTC TGCTGCGCGG 
651 GAGGCCCAGT TGCTGATTTC TGCCCGGATT CTGCTGCCCG GTGAGGTCTT 
701 TGCCCTGCGG CGCCCTCGCC CAGGGCAAAG TCCCAGCCCT GGAGAAAACA 
751 CCTCACCCCT ACCCACAGCG CTCCGTTTGT CAGGTGCCTT AGAGCTCGAG 
801 CCCAAGGGAT AATGTTTCGA GTAACGCTGT TTCTCTAACT TGTAGGAATG 
851 AATTCAGATA TTTCCAGAGA ATGACCACAA CCTCTTCAGT AGAAGGTAAT 
901 GTGGGATTAA GTAGGGTCTT GCTTGATGAA GTTTACCAGT GCAAATGTTA 
951 GTTAAATGGA AAGTTTTCCG TGTTAATCTG GGACCTTTTC TCTTATTATG 

1001 GATCTGTATG ATCTGTATGC AGTTCCCAAG GTTCATTTAC CATTATTAAA 
1051 AAATTTTTGT CTTAGAAATT TTATGTATGT CAACGCACGA GCAAATTATC 
1101 AGGCATGGGG CAGAATTGGC AACTGGGTGG AGGCTTCGGT GGAGGTTAGC 
1151 ACTCCGAAAG GAAAACAGAG TAGGCCTTTG GAACAGCTGC TGGAAGAGAT 
1201 AAGGCCTGAA CAAGGGCAGT GGAGAAGAGA GGGTAAAAAT TTTTTAAGGT 
1251 TACATGACCC TGGATTTTGG AGATC 



Figure 4A 

1 CTGCAGCGCC AGGGTCCACC TGGTCGGCTG CACCTGTGGA GGAGGAGGTG 
51 GATTTCAGGC TTCCCGTAGA CTGGAAGAAT CGGCTCAAAA CCGCTTGCCT 
101 CGCAGGGGCT GAGCTGGAGG CAGCGAGGCC GCCCGACGCA GGCTTCCGGC 
151 GAGACATGGC AGGGCAAGGA TGGCAGCCCG GCGGCAGGGC CCGGCGAGGA 
201 GCGCGAACCC GCGGCCGCAG TTCCCAGGCG TCTGCGGGCG CGAGCACGCC 
251 GCGACCCTGC GTGCGCCGGG GCGGGGGGGC GGGGCCTCGC CTGCACAAAT 
301 AGGGACGAGG GGGCGGGGCG GCCACAATTT CGCGCCAAAC TTGACCGCGC 
351 GTTCTGCTGT AACGAGCGGG CTCGGAGGTC CTCCCGCTGC TGTCATGGTT 
401 GGTTCGCTAA ACTGCATCGT CGCTGTGTCC CAGAACATGG GCATCGGCAA 
451 GAACGGGGAC CTGCCCTGGC CACCGCTCAG GTATCTGCCG GGGCGGGGCG 
501 ATGGGACCCA AACGGGCGCA GGCTGCCCAC GGTCGGGGT 
551 GG CCGACTCCCG GCGAGAGGAT GGGGCCAGAC TTGCGGTCTG 

601 CGCTGGCAGG AAGGGTGGGC CCGACTGGAT TCCCCTTTTC TGCTGCGCGG 
651 GAGGCCCAGT TGCTGATTTC TGCCCGGATT CTGCTGCCCG GTGAGGTCTT 

701 TGCCCTGCGG CGCCCTCGCC CAGGGCAAAG TCCCAGCCCT GGAGAAAACA 
751 CCTCACCCCT ACCCACAGCG CTCCGTTTGT CAGGTGCCTT AGAGCTCGAG 
801 CCCAAGGGAT AATGTTTCGA GTAACGCTGT TTCTCTAACT TGTAGGAATG 
851 AATTCAGATA TTTCCAGAGA ATGACCACAA CCTCTTCAGT AGAAGGTAAT 
901 GTGGGATTAA GTAGGGTCTT GCTTGATGAA GTTTACCAGT GCAAATGTTA 
951 GTTAAATGGA AAGTTTTCCG TGTTAATCTG GGACCTTTTC TCTTATTATG 

1001 GATCTGTATG ATCTGTATGC AGTTCCCAAG GTTCATTTAC CATTATTAAA 
1051 AAATTTTTGT CTTAGAAATT TTATGTATGT CAACGCACGA GCAAATTATC 
1101 AGGCATGGGG CAGAATTGGC AACTGGGTGG AGGCTTCGGT GGAGGTTAGC 
1151 ACTCCGAAAG GAAAACAGAG TAGGCCTTTG GAACAGCTGC TGGAAGAGAT 
1201 AAGGCCTGAA CAAGGGCAGT GGAGAAGAGA GGGTAAAAAT TTTTTAAGGT 
1251 TACATGACCC TGGATTTTGG AGATC 



Figure 4B 
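
The relationship between the sequences of Figures 4A and 4B can be checked with a short Python sketch. It is a reading aid under stated assumptions: the 100-nucleotide window below was transcribed from rows 501 and 551 of Figure 4A, positions are treated as 1-based and inclusive, and the function and variable names are hypothetical.

```python
# Nucleotides 501-600 of the non-deleted sequence, transcribed from Figure 4A.
WINDOW_START = 501
WILD_TYPE_WINDOW = (
    "ATGGGACCCA" "AACGGGCGCA" "GGCTGCCCAC" "GGTCGGGGTA" "CCTGGGCGGG"
    "ACGCGCCAGG" "CCGACTCCCG" "GCGAGAGGAT" "GGGGCCAGAC" "TTGCGGTCTG"
)


def split_deletion(window, window_start, first, last):
    """Return (deleted segment, window without it) for 1-based inclusive positions."""
    lo = first - window_start
    hi = last - window_start + 1
    return window[lo:hi], window[:lo] + window[hi:]


# The claims describe a deletion spanning nucleotides 540 to 558.
deleted, deletion_allele_window = split_deletion(
    WILD_TYPE_WINDOW, WINDOW_START, 540, 558
)

print("deleted segment:", deleted, f"({len(deleted)} bp)")
print("deletion allele window:", deletion_allele_window)
```

Under these assumptions the removed segment is 19 nucleotides long, and the remaining window reads GGTCGGGGT followed directly by GGCCGACTCCCG, matching the gap shown in rows 501 and 551 of Figure 4B.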
<120> METHODS FOR DIAGNOSING, PREVENTING, AND TREATING 
DEVELOPMENTAL DISORDERS 

<130> 601-1-057N 

<140> UNASSIGNED 
<141> 2000-05-23 



<150> 60/136,198 
<151> 1999-05-25 



<170> PatentIn Ver. 2.0 

<210> 1 

<211> 2187 

<212> DNA 

<213> Homo sapiens 

<400> 1 

gccatggtga acgaagccag aggaaacagc agcctcaacc cctgcttgga gggcagtgcc 60 
agcagtggca gtgagagctc caaagatagt tcgagatgtt ccaccccggg cctggaccct 120 
gagcggcatg agagactccg ggagaagatg aggcggcgat tggaatctgg tgacaagtgg 180 
ttctccctgg aattcttccc tcctcgaact gctgagggag ctgtcaatct catctcaagg 240 
tttgaccgga tggcagcagg tggccccctc tacatagacg tgacctggca cccagcaggt 300 
gaccctggct cagacaagga gacctcctcc atgatgatcg ccagcaccgc cgtgaactac 360 
tgtggcctgg agaccatcct gcacatgacc tgctgccgtc agcgcctgga ggagatcacg 420 
ggccatctgc acaaagctaa gcagctgggc ctgaagaaca tcatggcgct gcggggagac 480 
ccaataggtg accagtggga agaggaggag ggaggcttca actacgcagt ggacctggtg 540 
aagcacatcc gaagtgagtt tggtgactac tttgacatct gtgtggcagg ttaccccaaa 600 
ggccaccccg aagcagggag ctttgaggct gacctgaagc acttgaagga gaaggtgtct 660 
gcgggagccg atttcatcat cacgcagctt ttctttgagg ctgacacatt cttccgcttt 720 
gtgaaggcat gcaccgacat gggcatcact tgccccatcg tccccgggat ctttcccatc 780 
cagggctacc actcccttcg gcagcttgtg aagctgtcca agctggaggt gccacaggag 840 
atcaaggacg tgattgagcc aatcaaagac aacgatgctg ccatccgcaa ctatggcatc 900 
gagctggccg tgagcctgtg ccaggagctt ctggccagtg gcttggtgcc aggcctccac 960 
ttctacaccc tcaaccgcga gatggctacc acagaggtgc tgaagcgcct ggggatgtgg 1020 
actgaggacc ccaggcgtcc cctaccctgg gctctcagtg cccaccccaa gcgccgagag 1080 
gaagatgtac gtcccatctt ctgggcctcc agaccaaaga gttacatcta ccgtacccag 1140 
gagtgggacg agttccctaa cggccgctgg ggcaattcct cttcccctgc ctttggggag 1200 
ctgaaggact actacctctt ctacctgaag agcaagtccc ccaaggagga gctgctgaag 1260 
atgtgggggg aggagctgac cagtgaagca agtgtctttg aagtctttgt tctttacctc 1320 
tcgggagaac caaaccggaa tggtcacaaa gtgacttgcc tgccctggaa cgatgagccc 1380 
ctggcggctg agaccagcct gctgaaggag gagctgctgc gggtgaaccg ccagggcatc 1440 
ctcaccatca actcacagcc caacatcaac gggaagccgt cctccgaccc catcgtgggc 1500 
tggggcccca gcgggggcta tgtcttccag aaggcctact tagagttttt cacttcccgc 1560 
gagacagcgg aagcacttct gcaagtgctg aagaagtacg agctccgggt taattaccac 1620 
cttgtcaatg tgaagggtga aaacatcacc aatgcccctg aactgcagcc gaatgctgtc 1680 
acttggggca tcttccctgg gcgagagatc atccagccca ccgtagtgga tcccgtcagc 1740 
ttcatgttct ggaaggacga ggcctttgcc ctgtggattg agcggtgggg aaagctgtat 1800 
gaggaggagt ccccgtcccg caccatcatc cagtacatcc acgacaacta cttcctggtc 1860 



aacctggtgg acaatgactt cccactggac aactgcctct ggcaggtggt ggaagacaca 1920 
ttggagcttc tcaacaggcc cacccagaat gcgagagaaa cggaggctcc atgaccctgc 1980 
gtcctgacgc cctgcgttgg agccactcct gtcccgcctt cctcctccac agtgctgctt 2040 
ctcttgggaa ctccactctc cttcgtgtct ctcccacccc ggcctccact cccccacctg 2100 
acaatggcag ctagactgga gtgaggcttc caggctcttc ctggacctga gtcggcccca 2160 
catgggaacc tagtactctc tgctcta 2187 

<210> 2 

<211> 7122 

<212> DNA 

<213> Homo sapiens 

<400> 2 

gcgcgtgtct ggctgctagg ccgacaccaa ggactggccg ggtacccggg aagaaagcac 60 
gtgctccagc agttgccgcg cccagccccg agagaggccc tagggcgctg cgggctttcg 120 
gggtccgcag tccccccgcg acgcgagcca acgggaggcg tcaaaagacc cgggccttgt 180 
gtggcaggct cgcctggcgc tggctggcgt ggcccttggc cgtcgtcacc tgtggagagc 240 
acgtcttctc tgccgcgccc tctgcgcaag gaggagactc gacaacatgt cacccgcgct 300 
ccaagacctg tcgcaacccg aaggtctgaa gaaaaccctg cgggatgaga tcaatgccat 360 
tctgcagaag aggattatgg tgctggatgg agggatgggg accatgatcc agcgggagaa 420 
gctaaacgaa gaacacttcc gaggtcagga atttaaagat catgccaggc cgctgaaagg 480 
caacaatgac attttaagta taactcagcc tgatgtcatt taccaaatcc ataaggaata 540 
cttgctggct ggggcagata tcattgaaac aaatactttt agcagcacta gtattgccca 600 
agctgactat ggccttgaac acttggccta ccggatgaac atgtgctctg caggagtggc 660 
cagaaaagct gccgaggagg taactctcca gacaggaatt aagaggtttg tggcaggggc 720 
tctgggtccg actaataaga cactctctgt gtccccatct gtggaaaggc cggattatag 780 
gaacatcaca tttgatgagc ttgttgaagc ataccaagag caggccaaag gacttctgga 840 
tggcggggtt gatatcttac tcattgaaac tatttttgat actgccaatg ccaaggcagc 900 
cttgtttgca ctccaaaatc tttttgagga gaaatatgct ccccggccta tctttatttc 960 
agggacgatc gttgataaaa gtgggcggac tctttccgga cagacaggag agggatttgt 1020 
catcagcgtg tctcatggag aaccactcta cattggatta aattgtgctt tgggtgcagc 1080 
tgaaatgaga ccttttattg aaataattgg aaaatgtaca acagcctatg tcctctgtta 1140 
tcccaatgca ggtcttccca acacctttgg tgactatgat gaaacgcctt ctatgatggc 1200 
caagcaccta aaggattttg ctatggatgg cttggtcaat atagttggag gatgctgtgg 1260 
gtcaacacca gatcatatca gggaaattgc tgaagctgtg aaaaattgta agcctagagt 1320 
tccacctgcc actgcttttg aaggacatat gttactgtct ggtctagagc ccttcaggat 1380 
tggaccgtac accaactttg ttaacattgg agagcgctgt aatgttgcag gatcaaggaa 1440 
gtttgctaaa ctcatcatgg caggaaacta tgaagaagcc ttgtgtgttg ccaaagtgca 1500 
ggtggaaatg ggagcccagg tgttggatgt caacatggat gatggcatgc tagatggtcc 1560 
aagtgcaatg accagatttt gcaacttaat tgcttccgag ccagacatcg caaaggtacc 1620 
tttgtgcatc gactcctcca attttgctgt gattgaagct gggttaaagt gctgccaagg 1680 
gaagtgcatt gtcaatagca ttagtctgaa ggaaggagag gacgacttct tggagaaggc 1740 
caggaagatt aaaaagtatg gagctgctat ggtggtcatg gcttttgatg aagaaggaca 1800 
ggcaacagaa acagacacaa aaatcagagt gtgcacccgg gcctaccatc tgcttgtgaa 1860 
aaaactgggc tttaatccaa atgacattat ttttgaccct aatatcctaa ccattgggac 1920 
tggaatggag gaacacaact tgtatgccat taattttatc catgcaacaa aagtcattaa 1980 
agaaacatta cctggagcca gaataagtgg aggtctttcc aacttgtcct tctccttccg 2040 
aggaatggaa gccattcgag aagcaatgca tggggttttc ctttaccatg caatcaagtc 2100 
tggcatggac atggggatag tgaatgctgg aaacctccct gtgtatgatg atatccataa 2160 
ggaacttctg cagctctgtg aagatctcat ctggaataaa gaccctgagg ccactgagaa 2220 
gctcttacgt tatgcccaga ctcaaggcac aggagggaag aaagtcattc agactgatga 2280 
gtggagaaat ggccctgtcg aagaacgcct tgagtatgcc cttgtgaagg gcattgaaaa 2340 
acatattatt gaggatactg aggaagccag gttaaaccaa aaaaaatatc cccgacctct 2400 
caatataatt gaaggacccc tgatgaatgg aatgaaaatt gttggtgatc tttttggagc 2460 
tggaaaaatg tttctacctc aggttataaa gtcagcccgg gttatgaaga aggctgttgg 2520 
ccaccttatc cctttcatgg aaaaagaaag agaagaaacc agagtgctta acggcacagt 2580 
agaagaagag gacccttacc agggcaccat cgtgctggcc actgttaaag gcgacgtgca 2640 



cgacataggc aagaacatag ttggagtagt ccttggctgc aataatttcc gagttattga 2700 
tttaggagtc atgactccat gtgataagat actgaaagct gctcttgacc acaaagcaga 2760 
tataattggc ctgtcaggac tcatcactcc ttccctggat gaaatgattt ttgttgccaa 2820 
ggaaatggag agattagcta taaggattcc attgttgatt ggaggagcaa ccacttcaaa 2880 
aacccacaca gcagttaaaa tagctccgag atacagtgca cctgtaatcc atgtcctgga 2940 
cgcgtccaag agtgtggtgg tgtgttccca gctgttagat gaaaatctaa aggatgaata 3000 
ctttgaggaa atcatggaag aatatgaaga tattagacag gaccattatg agtctctcaa 3060 
ggagaggaga tacttaccct taagtcaagc cagaaaaagt ggtttccaaa tggattggct 3120 
gtctgaacct cacccagtga agcccacgtt tattgggacc caggtctttg aagactatga 3180 
cctgcagaag ctggtggact acattgactg gaagcctttc tttgatgtct ggcagctccg 3240 
gggcaagtac ccgaatcgag gctttcccaa gatatttaac gacaaaacag taggtggaga 3300 
ggccaggaag gtctacgatg atgcccacaa tatgctgaac acactgatta gtcaaaagaa 3360 
actccgggcc cggggtgtgg ttgggttctg gccagcacag agtatccaag acgacattca 3420 
cctgtacgca gaggctgctg tgccccaggc tgcagagccc atagccacct tctatgggtt 3480 
aaggcaacag gctgagaagg actctgccag cacggagcca tactactgcc tctcagactt 3540 
catcgctccc ttgcattctg gcatccgtga ctacctgggc ctgtttgccg ttgcctgctt 3600 
tggggtagaa gagctgagca aggcctatga ggatgatggt gacgactaca gcagcatcat 3660 
ggtcaaggcg ctgggggacc ggctggcaga ggcctttgca gaagagctcc atgaaagagt 3720 
tcgccgagaa ctgtgggcct actgtggcag tgagcagctg gacgtcgcag acctgcgcag 3780 
gctgcggtac aagggcatcc gcccggctcc tggctacccc agccagcccg accacaccga 3840 
gaagctcacc atgtggagac tcgcagacat cgagcagtct acaggcatta ggttaacaga 3900 
atcattagca atggcacctg cttcagcagt ctcaggcctc tacttctcca atttgaagtc 3960 
caaatatttt gctgtgggga agatttccaa ggatcaggtt gaggattatg cattgaggaa 4020 
gaacatatct gtggctgagg ttgagaaatg gcttggaccc attttgggat atgatacaga 4080 
ctaacttttt ttttttttgc cttttttatt cttgatgatc ctcaaggaaa tacaacctag 4140 
ggtgccttaa aaataacaac aacaaaaaac ctgtgtgcat ctggctgaca cttacctgct 4200 
tctggttttc gaagactatt tagtggaacc ttgtagagga gcagggtctt cctgcagtgc 4260 
ctggaaaaca ggcgctgttt ttttgggacc ttgcgtgaag agcagtgagc agggttcctg 4320 
tggtttccct ggtccctctg agatggggac agactgaaga cagaggtcgt ttgatttcaa 4380 
agcaagtcaa cctgcttttt tctgttttta cagtggaatc taggaggcca cttagtcgtc 4440 
tttttttcct cttagaagaa aagcctgaaa ctgagttgaa tagagaagtg tgaccctgtg 4500 
acaaaatgat actgtgaaaa atggggcatt ttaatctaag tggttataac agtggattct 4560 
gacggggaag gtgtagctct gttctcttcg gaagacctcg ttttctaaag gctggactaa 4620 
atggctgcag aactcccttt ggcaaaaggc atgcgctcac tgcttgcttg tcagaaacac 4680 
tgaagccatt tgccccagtg tggtcaagca gccatgcttt ctgggcattt tcgtcctccc 4740 
ataatttcat atttccgtac ccctgaggaa acaaaaagga aatgaggaga gaaagttact 4800 
gttaagggtg gttaacattt tttttgtttt gttttgtttt ggtttttttt ttttgagaca 4860 
gagtctggct ctgtcgccca ggctggagtg caggggcgca atctcggctc atagcaagct 4920 
ccgcctcctg ggttcatgcc attctcctgc ctcagcctcc agagtagctg ggactacagg 4980 
tgcccaccac cacacccggc taattttttg tgtttttaca aaatacaaaa aagtagagac 5040 
aggatttcac tgtgttagcc aggatggtct tgatctcccg acctcgtgat ctgcccacct 5100 
cagcctccca aaatgctggg attacaggcg tgagccaccg agcctggccg gttaacatct 5160 
tttaattgtt tccaggattg agcaggttct cagctgggct ctgatatccc gtgcggagtt 5220 
ggacaagtgg gcagcataaa gtcactcatt tcttaccatt ttattcccct caattctcaa 5280 
tatattcagt aatgaagaat ggtgccacca ctcaagcaac aagcctcaaa ctcaaccatg 5340 
tcatcttttt cttggatgat tgcagttatt tcaaaaattt gcatgcaaaa tatacactca 5400 
tcctacttca agatggtggt ggcaatagtc aggagaaggt aacattggag tcctggtttg 5460 
attcgaagga tgaagacgaa gaagcaaggg aggaacaaat gaagaaccat ctttgttcat 5520 
gaataggaat attcaagatt ataaaggtat caggtctcct aaaattgatc tatggattta 5580 
ataccatttt caatggaaat tccaacagat tttattgaat gaaacaagca ggtgtttata 5640 
tggagtagca aaggacttaa aattaccaaa tgcttctaaa tatgaaggag aggttgggga 5700 
cacgcaccct atgtgatacc aagttttatt gtcaagacag tgtcatggtg cagaggtagg 5760 
cattctgagc aggggaacaa aataagggcc tagaaactca cccgtgcata tgttgacctt 5820 
tgcaaaatga cctggtgaca tggcaagtca gtggggacag gaaggaccac tccctaagta 5880 
atcccagaac aatggctatt catgtgggaa aaaaagaaat tttactttct ctcaccttac 5940 
ctggtgataa gttccaaata tgttaagggc tttaatacaa aaagcaaaaa ttgtcagtgt 6000 
ttggatgaaa aaagccttag ggcaggaaag aatctcttga gacataaagt agtaatcata 6060 



aaggacaaga tggttaagtc aattctgtta aaactcaagg cttatattaa gcaaacactt 6120 
gaagtgagaa gatgatccac aacttgagaa gacatttata atacaaataa ctgatgaagg 6180 
attcataatc acaaatatag agaattccta tttaaaaaaa tagaaaaata gtgaagacta 6240 
cacaagagga aatagggctt ttaaataaat agatgttctg tagcattggt cagggaaata 6300 
tgaattagga ccacaatgag attccatttt atatccataa gatttgcaaa ggttgggtct 6360 
gacagtacca gttgttagat ctgtagggac ttgtacaaca ttgtggatgt gtaaacaggc 6420 
accactgctt taaaaaacaa ttatccctta cagacttgaa catttgcaga cgttatgatc 6480 
ttgcttccaa ctcccacctg tatgtccagc aaactcttgc atgtggccac taggaggaat 6540 
gtgtaagaat gttcatagtt acatatttat aatagttaat aactggaaaa agtgaaatgt 6600 
atgtctgtct acaggaaaat aggtgaataa ttagatatat atattcattc tacgggatat 6660 
tattcagtag tggaaatgag tgaactacag ctatacctca caataagaat gaatctcaga 6720 
aaatattaag gaaaaaagca agtttgaaga gaccacatgg ggcgtactat ttttattggg 6780 
cccaaaaaca agcaaaacca aagaatatgt agtctaagca tacgtataca ataaaactat 6840 
gctattaaaa aaaaaaggta actgataaac caaaattgag catagtaatt acccacagaa 6900 
ggaggaagtg gaagggacag gagcacatag gtagatgcca agttatgcag ctgttctggt 6960 
tcctcctggt aggcttacaa gtgtttacta tatgctatta atacattata ctttataact 7020 
aatagataac agttttttac atattaaata tgttctactt aaatatatta taaaaaataa 7080 
aggcaaagtg gaatgtttaa aaaaaaaaaa aaaaaaaaaa aa 7122 



<210> 3 

<211> 564 

<212> DNA 

<213> Homo sapiens 



<400> 3 

atggttggtt cgctaaactg catcgtcgct gtgtcccaga acatgggcat cggcaagaac 60 
ggggacctgc cctggccacc gctcaggaat gaattcagat atttccagag aatgaccaca 120 
acctcttcag tagaaggtaa acagaatctg gtgattatgg gtaagaagac ctggttctcc 180 
attcctgaga agaatcgacc tttaaagggt agaattaatt tagttctcag cagagaactc 240 
aaggaacctc cacaaggagc tcattttctt tccagaagtc tagatgatgc cttaaaactt 300 
actgaacaac cagaattagc aaataaagta gacatggtct ggatagttgg tggcagttct 360 
gtttataagg aagccatgaa tcacccaggc catcttaaac tatttgtgac aaggatcatg 420 
caagactttg aaagtgacac gttttttcca gaaattgatt tggagaaata taaacttctg 480 
ccagaatacc caggtgttct ctctgatgtc caggaggaga aaggcattaa gtacaaattt 540 
gaagtatatg agaagaatga ttaa 564 



<210> 4 

<211> 2158 

<212> DNA 

<213> Homo sapiens 



<400> 4 

gcgcggcata acgacccagg tcgcggcgcg gcggggcttg agcgcgtggc cggtgccgca 60 
ggagccgagc atggagtacc aggatgccgt gcgcatgctc aataccctgc agaccaatgc 120 
cggctacctg gagcaggtga agcgccagcg gggtgaccct cagacacagt tggaagccat 180 
ggaactgtac ctggcacgga gtgggctgca ggtggaggac ttggaccggc tgaacatcat 240 
ccacgtcact gggacgaagg ggaagggctc cacctgtgcc ttcacggaat gtatcctccg 300 
aagctatggc ctgaagacgg gattctttag ctctccccac ctggtgcagg ttcgggagcg 360 
gatccgcatc aatgggcagc ccatcagtcc tgagctcttc accaagtact tctggcgcct 420 
ctaccaccgg ctggaggaga ccaaggatgg cagctgtgtc tccatgcccc cctacttccg 480 
cttcctgaca ctcatggcct tccacgtctt cctccaagag aaggtggacc tggcagtggt 540 
ggaggtgggc attggcgggg cttatgactg caccaacatc atcaggaagc ctgtggtgtg 600 
cggagtctcc tctcttggca tcgaccacac cagcctcctg ggggatacgg tggagaagat 660 
cgcatggcag aaagggggca tctttaagca aggtgtccct gccttcactg tgctccaacc 720 
tgaaggtccc ctggcagtgc tgagggaccg agcccagcag atctcatgtc ctctatacct 780 
gtgtccgatg ctggaggccc tcgaggaagg ggggccgccg ctgaccctgg gcctggaggg 840 
ggagcaccag cggtccaacg ccgccttggc cttgcagctg gcccactgct ggctgcagcg 900 



gcaggaccgc catggtgctg gggagccaaa ggcatccagg ccagggctcc tgtggcagct 960 
gcccctggca cctgtgttcc agcccacatc ccacatgcgg ctcgggcttc ggaacacgga 1020 
gtggccgggc cggacgcagg tgctgcggcg cgggcccctc acctggtacc tggacggtgc 1080 
gcacaccgcc agcagcgcgc aggcctgcgt gcgctggttc cgccaggcgc tgcagggccg 1140 
cgagaggccg agcggtggcc ccgaggttcg agtcttgctc ttcaatgcta ccggggaccg 1200 
ggacccggcg gccctgctga agctgctgca gccctgccag tttgactatg ccgtcttctg 1260 
ccctaacctg acagaggtgt catccacagg caacgcagac caacagaact tcacagtgac 1320 
actggaccag gtcctgctcc gctgcctgga acaccagcag cactggaacc acctggacga 1380 
agagcaggcc agcccggacc tctggagtgc ccccagccca gagcccggtg ggtccgcatc 1440 
cctgcttctg gcgccccacc caccccacac ctgcagtgcc agctccctcg tcttcagctg 1500 
catttcacat gccttgcaat ggatcagcca aggccgagac cccatcttcc agccacctag 1560 
tcccccaaag ggcctcctca cccaccctgt ggctcacagt ggggccagca tactccgtga 1620 
ggctgctgcc atccatgtgc tagtcactgg cagcctgcac ctggtgggtg gtgtcctgaa 1680 
gctgctggag cccgcactgt cccagtagcc aaggcccggg gttggaggtg ggagcttccc 1740 
acacctgcct gcgttctccc catgaactta catactaggt gccttttgtt tttggctttc 1800 
ctggttctgt ctagactggc ctaggggcca gggctttggg atgggaggcc gggagaggat 1860 
gtctttttta aggctctgtg ccttggtctc tccttcctct tggctgagat agcagagggg 1920 
ctccccgggt ctctcactgt tgcagtggcc tggccgttca gcctgtctcc cccaacaccc 1980 
cgcctgcctc ctggctcagg cccagcttat tgtgtgcgct gcctggccag gccctgggtc 2040 
ttgccatgtg ctgggtggta gatttcctcc tcccagtgcc ttctgggaag ggagagggcc 2100 
tctgcctggg acactgcggg acagagggtg gctggagtga attaaagcct ttgttttt 2158 

<210> 5 

<211> 7720 

<212> DNA 

<213> Homo sapiens 

<400> 5 

taagttgaca cttctcaggt tgtcacaaga ttcaggtatg gctcactgtt gcaggacata 60 
agctgggatc tcctgggaat tggtctgctt gcaggcccta gagagccttc cttcttggtt 120 
gattttcctc tagagatcca actgtcttct caggctcccc tgcctgcctc ctccttgggt 180 
cctttcttgt ggcattgcca gattactggg cccccatttt ccctacactt actgccactc 240 
atagtctgat ggttcccaca tctgcatcca acctggactc ttcccctgag ctttcccctc 300 
tacaaccacc ttccccgggc caagggcaca caggcacctc gacaaaacag tgttctatgt 360 
ttcttcctgc ccaaacctgc ccctccctct cccttttccc atctgtggta ccaccatggg 420 
ctcagagaat aaaaaaaatg aaggcttctg tcattgactg gggtggagat ggagggaaga 480 
gttagcccag aatcacaggt gctgtagaaa ggatacctga gttgccggga gagggggtcc 540 
atgagttggg gatggaagga gagcttggcc cttcaaacaa ttgaagatct gatcaaaaga 600 
ttcagaacat ctgtgatttt gtggctggtg atgggtgaca cctgggctaa tggggttggg 660 
ggagttggtg gctctacaat ttatggcctt gggagatcct tgctctctat agctgactgg 720 
gaggttggaa gcctgggctc tagcccttgc cttgatcctc cggatctcat tttcctcatc 780 
tgcctaacag gacagagggg ttggaaactg atgagattag ctcaaaggat cctggcagct 840 
caggctgcaa gatttttttc agacctcagt gtttgggaaa aaattgggta ggtggagctt 900 
agggactggc cttaggcctg cactgttaat tcaccccctc ccactacccc atggaggcct 960 
ggctggtgct cacatacaat aattaactgc tgagtggcct tcgcccaatc ccaggctcca 1020 
ctcctgggct ccattcccac tccctgcctg tctcctaggc cactaaacca cagctgtccc 1080 
ctggaataag gcaaggggga gtgtagagca gagcagaagc ctgagccaga cggagagcca 1140 
cctcctctcc caggtatgtg acactcccca tcccccttca gaggccacac accctatggc 1200 
attcccacca tgtgttaagg attttctgaa ctggaagggc cctctgtttg cctgaaggcc 1260 
agagaatctt gaagtggaga ctgaggccca gaccagagtg tggcctgctc aagattaaac 1320 
gacaagttag tgttcatccc cctgaactag tacctgggct ctagcccttc agtccagagc 1380 
tgagttctca gctcttctag tctggggccc caaggttggg tgtgggggtc atgattgttg 1440 
gtggggaggg gtcacagctg gactaagacc tgaaggtgag actaggcagg tgggaaagga 1500 
gcttgcagag tgatgctgct caaaaggaca ggaagagagc ctggcttcag aagcagccac 1560 
agcaagagag actactgact gaacaggtgg gctccactgg gggctccgga aaggattttc 1620 
tcagccccca tccccagcac tgtgtgttgg ccgcacccat gagagcctca gcactctgaa 1680 
ggtgcagggg gcaaaggcca aaagagctct ggcctgaact tgggtggtcc ctactgtgtg 1740 



acttggggca tggccctcat ctgtgctgaa atgattccac aaagattaaa ctggctatca 1800 
tttgttgatt tcccccttct tacatttaat ccttgcagga gaaagctaag cctcaagata 1860 
gtttgcttct ctttccccca aggccaagga gaaggtggag tgagggctgg ggtcgggaca 1920 
ggttgaacgg gaaccctgtg ctctaaacag ttagggtttg ttcccgcagg aactgaaccc 1980 
aaaggatcac ctggtattcc ctgagagtac agatttctcc ggcgtggccc tcaaggttag 2040 
tgagtgagca ggtccacagg ggcatgattg gatcctggaa tgaatgaatc aaccatgaga 2100 
gagtgaatga acactggaat caatagagta gcagagtaat ggattgtgga gcaggaaaga 2160 
gagctgctgg gtgggaattc aattccaggc ttatatgagc cctgctgtgc agtcggcctg 2220 
gagacagccc agctcaggcc ctgcctagac ccctgtcaag gaggccctgt caagaggaga 2280 
ggaggggcag cacgggggca aggcaagctt gtgagcggga aaggcatgtc cactttagcg 2340 
actggtatgt ggaagatgag ttagaggaga cagatggaga gaagtcatag gaaataaatt 2400 
ctgagcattt taggagggcc cagacacctg gtgtccagtg gagtgaagga aacagtcgcc 2460 
tcccaaaatt cagtgtctga ggtcaaagga ttgaagttct gtgatgacca aggagaagcc 2520 
agctctgtgg tagggggcac aggagctccc caaggcccca gggctgtcca gctggctgtc 2580 
ccctgccagc acccatgtcc tgtgacccca ccccaccaag atcccatggt ttccgggaag 2640 
ggcctactaa actagcttga gtgatgaggc tagaaagggg ctgggaccaa ggtttaaaaa 2700 
gcaaaacaaa ctaacaaaaa ccacactgca gcccccccaa ctaaaacatt tttataaact 2760 
tttttttttt ttttgagatg gagtctcgct ctgtcaccca ggctagagtg caatggcaca 2820 
atcttggctc actgtaacct ccacctcctg gattcaagtg attctcctgc ctcagcctcc 2880 
cacgtagctg ggactacagg cacacgacac cgcacccagc tcattttgta tttttagtag 2940 
agacagggtt tcactatgtt ggccaggctg gtctcaaact tctgacctca ggtgatccac 3000 
ccacctcagc cttccaaagt gctgggatta caggcatgag ccaccgcgcc cagcccattt 3060 
ttgtaaactt ttacaatgaa gtaatttggt gtcaaaatct gacctgaaaa ttaatgtgag 3120 
tttatgtata gttttaattt atcccactag tgtaactgtt tcaccccaga atatacactt 3180 
gattattggg tatatgaaaa aaatattttc tttgaatcac ctttgatgaa atcctaaaaa 3240 
attttaaccc tgaaacattt gaataaggca ttgtggacct atggcaaact cctggctatt 3300 
tctgcatttt gcccaaatcc atccttgaat tatatcacct gaacctcgtg accacctgga 3360 
gaaggcaatg aggctcaagc cagggagggg tggtgtctaa tcctaccttt cattggatct 3420 
gggaaaactg agggagatgg gggcagggct ctatctgccc caggcttccg tccaggcccc 3480 
accctcctgg agccctgcac acaacttaag gccccacctc cgcattcctt ggtgccactg 3540 
accacagctc tttcttcagg gacagacatg gctcagcgga tgacaacaca gctgctgctc 3600 
cttctagtgt gggtggctgt agtaggggag gctcagacaa ggattgcatg ggccaggact 3660 
gagcttctca atgtctgcat gaacgccaag caccacaagg aaaagccagg ccccgaggac 3720 
aagttgcatg agcaggtggg ccagggggtg atctggggtg gtgagggact ggctcaggaa 3780 
gaggaaacga ggacatggaa atgccaaacc ccattggcac tggtgaactg aagtggagga 3840 
gcccttcagt ttgcattaat atgggtgact tatttcagag acactgtgcc aaatgtcggt 3900 
acaatgccaa cagttcacct tcttggttgt tgagtttccg cattacagaa ataaggaagc 3960 
aggcccaaag gagagcctgg gaaatgaagt tggagtgacc catcctgggg ttgcttgatt 4020 
tagggattta gactgggaat gactcctcca aagatctgag ggaagaaact gcacactgtg 4080 
catagtggcc tcttttctgc cagccctaaa cagctcaaga agggagagtc tctcacatta 4140 
tgaggctgtg tgcaaagcat tctttttttt ttttcctgag acaaagtctc catatgttgc 4200 
ccaggctggt ctcaaattcc tggactcaag tgatcctccc acctcagccc tcccaaagtg 4260 
tgggattaca gaaatgagcc gtacgccctc ctgaagcatc ttggttcatg catctcgcaa 4320 
aactttgggc tgtgtctctc gaccacattg gacctgaggt ctccctataa catttatttt 4380 
gctaccaccc ctttaatatc ctgaacatga tgatataact aaagaaaaag cagaggaaaa 4440 
gtaatttgta ggccaggtgt tacggctcac gcctgtaatc ccaacactgt gggatgtcga 4500 
gatgggcaga tcacttgagc tcaggagttc gagaccagcc tgggcaagat ggcaaaaccc 4560 
catctctact aaaaaataaa aaaaattagt caggtgtggt ggcacatgcc tgcagtccca 4620 
gctactcagg aggctgaggt gggcaggtca gttgagccca ggaggcagag attgtagatc 4680 
gtgccactgc actccagcct gggcaacaga gtgagacctt gtcaaaagaa agaaagaacg 4740 
aaaaaaagaa agaaaggaag gaaggaaggg gaggaaggaa agggagggag gaaagggagg 4800 
gaggaaaggg agggaggcaa gggagagaaa cttgtaatac gcatttcttt ttttttttct 4860 
tgagatagag ttttgctctt gttgcccagg gtggatggca gtggcacaat ctcagctcac 4920 
tgcaacctcc acctcccagg ttcaagtgat tctcctgcct cagcctcctg agtaggcaca 4980 
cgccaccaca cccagctaat tttttgtttg tttgtttgtt ttgtttgttg gtatttttag 5040 
tagagatggg ggtttcacca tgttggccag gctggtctcg aactcctcac ctcataatcc 5100 
gcccctcttg gcctcccaaa gtgctgagat tacaggtgtg agccactgcg cccggcctta 5160 



agtgcacatt ttatttattt atttatttat ttatttattg agatggagtc ttgctctgtt 5220 
gcccaggctg gagtgcagtg gcacaatctc agctcactgc aacctccacc tcccaggttc 5280 
aagcaattct tctgccttgg cctccagagt agctgggact ataggcacct gccaccatgc 5340 
ctagctaatt tttgtatttt tagtagaaat ggggttttgc catgttggcc aggctggtct 5400 
ccattcttga ccttaagtga tctgtccacc tccacctccc aaagtgctgg gattacaggc 5460 
actatgtgag ccactgtgcc ggcccacatt ttaatattta gcttgtcagc cttaagtaat 5520 
gagattcagg aagcttgagg ataggcacac aggagcatag tttcaagttg tcctgaattt 5580 
tgcagccatc acaagttagt ttttaaggaa aaagattagt tcctaagttg tttctcaata 5640 
acttataata aaataacatc cacaattgat tggctataca ttgttttttt gtatcacaaa 5700 
ttccacaaac agataatggg tgaggcagct agtcagggac aaaacacttc ccaagtagct 5760 
gggattacag gtgtccgcca ccacacttgg ctagtttttt gtttgtttat tttttgagat 5820 
ggagtcttgc tctgtcgccc aggctggagt gcagtggcat gatctcggct cactgcaagc 5880 
tccacctgcc gggttcacac cattctcctg cctcagcctc ccaagtagct gggactacag 5940 
gtgccagcca ccacgcccgg ctaatttttt gtatttttag tagagacggg gtttcaccat 6000 
gttggccagg atggtcttga tctcttagcc tcgtgatcca cccgcctcgg cctcccaaaa 6060 
tgctgggatt acaggcgtga gccaccgcac ccggcctaat ttttatattt ttagtagaga 6120 
cggggtttca ccatgttggc caggctggtc tcaaactctt gatctcaggt gatccacctg 6180 
ccttggcctc ccaaagtgct gggattacac aagtaagcca ctgcacccag cctggggtta 6240 
caatttaaat tgctttttta ccttcaaatc tttgacacct cagtgaggct taatctgacc 6300 
gcactattac actacaagtc cccatccgtc tctgcttaat ttttgtccaa agcaaaaatc 6360 
aggtgatgtg ttcattgttg taaccccagt ttctacaaaa gtacctgggt gagagtaagt 6420 
aggatctcaa taaaggttga attaacaaat tttgtaatga ctgcaactcc agcaggagct 6480 
cccttttggg ctcccactgt ctctgacggc cctctcccct aaagaggtcc caatagcaag 6540 
tattttcctg ggtgacttcc agtgggctgg ggaatcaagg actaagaggg gagacactgc 6600 
atgtggaata ttctggctgt gctggctgtg ctggctgtgg actgagtcct ctgtcttccc 6660 
ccatccagtg tcgaccctgg aggaagaatg cctgctgttc taccaacacc agccaggaag 6720 
cccataagga tgtttcctac ctatatagat tcaactggaa ccactgtgga gagatggcac 6780 
ctgcctgcaa acggcatttc atccaggaca cctgcctcta cgagtgctcc cccaacttgg 6840 
ggccctggat ccagcaggta tgcatggctt cctgcaggta caagacctag cggagcagct 6900 
gagctttcca ggcatctctg caggctgcaa ccccagctcc agttctattc ggggctgagt 6960 
tgctgggatt cttgaacctg agcccttctt ttgtatcaaa atcacccagg tggatcagag 7020 
ctggcgcaaa gagcgggtac tgaacgtgcc cctgtgcaaa gaggactgtg agcaatggtg 7080 
ggaagattgt cgcacctcct acacctgcaa gagcaactgg cacaagggct ggaactggac 7140 
ttcaggtgag ggctggggtg ggcaggaatg gagggatttg gaagtggagg tgtgtgggtg 7200 
tggaacaggt atgtgacaat ttggagttgt agggctggca gacctcaaga tagttccggg 7260 
cccagtggct aaaggtcttc cctcctctct acagggttta acaagtgcgc agtgggagct 7320 
gcctgccaac ctttccattt ctacttcccc acacccactg ttctgtgcaa tgaaatctgg 7380 
actcactcct acaaggtcag caactacagc cgagggagtg gccgctgcat ccagatgtgg 7440 
ttcgacccag cccagggcaa ccccaatgag gaggtggcga ggttctatgc tgcagccatg 7500 
agtggggctg ggccctgggc agcctggcct ttcctgctta gcctggccct aatgctgctg 7560 
tggctgctca gctgacctcc ttttaccttc tgatacctgg aaatccctgc cctgttcagc 7620 
cccacagctc ccaactattt ggttcctgct ccatggtcgg gcctctgaca gccactttga 7680 
ataaaccaga caccgcacat gtgtcttgag aattatttgg 7720 

<210> 6 

<211> 255 

<212> PRT 

<213> Homo sapiens 

<400> 6 

Met Val Trp Lys Trp Met Pro Leu Leu Leu Leu Leu Val Cys Val Ala 
1 5 10 15 

Thr Met Cys Ser Ala Gln Asp Arg Thr Asp Leu Leu Asn Val Cys Met 
20 25 30 

Asp Ala Lys His His Lys Thr Lys Pro Gly Pro Glu Asp Lys Leu His 
35 40 45 

Asp Gln Cys Ser Pro Trp Lys Lys Asn Ala Cys Cys Thr Ala Ser Thr 
50 55 60 

Ser Gln Glu Leu His Lys Asp Thr Ser Arg Leu Tyr Asn Phe Asn Trp 
65 70 75 80 

Asp His Cys Gly Lys Met Glu Pro Ala Cys Lys Arg His Phe Ile Gln 
85 90 95 

Asp Thr Cys Leu Tyr Glu Cys Ser Pro Asn Leu Gly Pro Trp Ile Gln 
100 105 110 

Gln Val Asn Gln Thr Trp Arg Lys Glu Arg Phe Leu Asp Val Pro Leu 
115 120 125 

Cys Lys Glu Asp Cys Gln Arg Trp Trp Glu Asp Cys His Thr Ser His 
130 135 140 

Thr Cys Lys Ser Asn Trp His Arg Gly Trp Asp Trp Thr Ser Gly Val 
145 150 155 160 

Asn Lys Cys Pro Ala Gly Ala Leu Cys Arg Thr Phe Glu Ser Tyr Phe 
165 170 175 

Pro Thr Pro Ala Ala Leu Cys Glu Gly Leu Trp Ser His Ser Tyr Lys 
180 185 190 

Val Ser Asn Tyr Ser Arg Gly Ser Gly Arg Cys Ile Gln Met Trp Phe 
195 200 205 

Asp Ser Ala Gln Gly Asn Pro Asn Glu Glu Val Ala Arg Phe Tyr Ala 
210 215 220 

Ala Ala Met His Val Asn Ala Gly Glu Met Leu His Gly Thr Gly Gly 
225 230 235 240 

Leu Leu Leu Ser Leu Ala Leu Met Leu Gln Leu Trp Leu Leu Gly 
245 250 255 



<210> 7 

<211> 817 

<212> DNA 

<213> Homo sapiens 

<400> 7 

cgcaggaata gatggacatg gcctggcaga tgatgcagct gctgcttctg gctttggtga 60 
ctgctgcggg gagtgcccag cccaggagtg cgcgggccag gacggacctg ctcaatgtct 120 
gcatgaacgc caagcaccac aagacacagc ccagccccga ggacgagctg tatggccagt 180 
gcagtccctg gaagaagaat gcctgctgca cggccagcac cagccaggag ctgcacaagg 240 
acacctcccg cctgtacaac tttaactggg atcactgtgg taagatggaa cccacctgca 300 
agcgccactt tatccaggac agctgtctct gagtgctcac ccaacctggg gccctggatc 360 
cggcaggtca accagagctg gcgcaaagag cgcattctga acgtgcccct gtgcaaagag 420 
gactgtgagc gctggtggga ggactgtcgc acctcctaca cctgcaaaag caactggcac 480 
aaaggctgga attggacctc agggattaat gagtgtccgg ccggggccct ctgcagcacc 540 



tttgagtcct acttccccac tccagccgcc ctttgtgaag gcctctggag ccactccttc 600 
aaggtcagca actatagtcg agggagcggc cgctgcatcc agatgtggtt tgactcagcc 660 
cagggcaacc ccaatgagga ggtggccaag ttctatgctg cggccatgaa tgctggggcc 720 
ccgtctcgtg ggattattga ttcctgatcc aagaagggtc ctctggggtt cttccaacaa 780 
cctattctaa tagacaaatc cacatgaaaa aaaaaaa 817 

<210> 8 

<211> 1669 

<212> DNA 

<213> Homo sapiens 

<400> 8 

gctaggcagc ttcgaaccag tgcaatgacg atgccagtca acggggccca caaggatgct 60 
gacctgtggt cctcacatga caagatgctg gcacaacccc tcaaagacag tgatgttgag 120 
gtttacaaca tcattaagaa ggagagtaac cggcagaggg ttggattgga gctgattgcc 180 
tcggagaatt tcgccagccg agcagttttg gaggccctag gctcttgctt aaataacaaa 240 
tactctgagg ggtacccggg ccagagatac tatggcggga ctgagtttat tgatgaactg 300 
gagaccctct gtcagaagcg agccctgcag gcctataagc tggacccaca gtgctggggg 360 
gtcaacgtcc agccctactc aggctcccct gcaaactttg ctgtgtacac tgccctggtg 420 
gaaccccatg ggcgcatcat gggcctggac cttccggatg ggggccacct gacccatggg 480 
ttcatgacag acaagaagaa aatctctgcc acgtccatct tctttgaatc tatgccctac 540 
aaggtgaacc cagatactgg ctacatcaac tatgaccagc tggaggagaa cgcacgcctc 600 
ttccacccga agctgatcat cgcaggaacc agctgctact cccgaaacct ggaatatgcc 660 
cggctacgga agattgcaga tgagaacggg gcgtatctca tggcggacat ggctcacatc 720 
agcgggctgg tggcggctgg cgtggtgccc tccccatttg aacactgcca tgtggtgacc 780 
accaccactc acaagaccct gcgaggctgc cgagctggca tgatcttcta caggaaagga 840 
gtgaaaagtg tggatcccaa gactggcaaa gagattctgt acaacctgga gtctcttatc 900 
aattctgctg tgttccctgg cctgcaggga ggtccccaca accacgccat tgctggggtt 960 
gctgtggcac tgaagcaagc tatgactctg gaatttaaag tttatcaaca ccaggtggtg 1020 
gccaactgca gggctctgtc tgaggccctg acggagctgg gctacaaaat agtcacaggt 1080 
ggttctgaca accatttgat ccttgtggat ctccgttcca aaggcacaga tggtggaagg 1140 
gctgagaagg tgctagaagc ctgttctatt gcctgcaaca agaacacctg tccaggtgac 1200 
agaagcgctc tgcggcccag tggactgcgg ctggggaccc cagcactgac gtcccgtgga 1260 
cttttggaaa aagacttcca aaaagtagcc cactttattc acagagggat agagctgacc 1320 
ctgcagatcc agagcgacac tggtgtcaga gccaccctga aagagttcaa ggagagactg 1380 
gcaggggata agtaccaggc ggccgtgcag gctctccggg aggaggttga gagcttcgcc 1440 
tctctcttcc ctctgcctgg cctgcctgac ttctaaagga gcgggcccac tctggaccca 1500 
cctggcgcca cagaggaagc tgcctgccgg agacccccac ctgagagatg gatgagctgc 1560 
tccaaaggga actgttgaca ctcgggccct ttgagggggt ttcttttgga cttttttcat 1620 
gttttcttca caaatcaaaa tttgtttaag tctcattgtt agtaattct 1669 

<210> 9 

<211> 3112 

<212> DNA 

<213> Homo sapiens 

<400> 9 

gtggaacctc gatattggtg gtgtccatcg tgggcagcgg actaataaag gccatggcgc 60 
cagcagaaat cctgaacggg aaggagatct ccgcgcaaat aagggcgaga ctgaaaaatc 120 
aagtcactca gttgaaggag caagtacctg gtttcacacc acgcctggca atattacagg 180 
ttggcaacag agatgattcc aatctttata taaatgtgaa gctgaaggct gctgaagaga 240 
ttgggatcaa agccactcac attaagttac caagaacaac cacagaatct gaggtgatga 300 
agtacattac atctttgaat gaagactcta ctgtacatgg gttcttagtg cagctacctt 360 
tagattcaga gaattccatt aacactgaag aagtgatcaa tgctattgca cccgagaagg 420 
atgtggatgg attgactagc atcaatgctg ggagacttgc tagaggtgac ctcaatgact 480 
gtttcattcc ttgtacgcct aagggatgct tggaactcat caaagagaca ggggtgccga 540 
ttgccggaag gcatgctgtg gtggttgggc gcagtaaaat agttggggcc ccgatgcatg 600 



acttgcttct gtggaacaat gccacagtga ccacctgcca ctccaagact gcccatctgg 660 
atgaggaggt aaataaaggt gacatcctgg tggttgcaac tggtcagcct gaaatggtta 720 
aaggggagtg gatcaaacct ggggcaatag tcatcgactg tggaatcaat tatgtcccag 780 
atgataaaaa accaaatggg agaaaagttg tgggtgatgt ggcatacgac gaggccaaag 840 
agagggcgag cttcatcact cctgttcctg gcggcgtagg gcccatgaca gttgcaatgc 900 
tcatgcagag cacagtagag agtgccaagc gtttcctgga gaaatttaag ccaggaaagt 960 
ggatgattca gtataacaac cttaacctca agacacctgt tccaagtgac attgatatat 1020 
cacgatcttg taaaccgaag cccattggta agctggctcg agaaattggt ctgctgtctg 1080 
aagaggtaga attatatggt gaaacaaagg ccaaagttct gctgtcagca ctagaacgcc 1140 
tgaagcaccg gcctgatggg aaatacgtgg tggtgactgg aataactcca acacccctgg 1200 
gagaagggaa aagcacaact acaatcgggc tagtgcaagc ccttggtgcc catctctacc 1260 
agaatgtctt tgcgtgtgtg cgacagcctt ctcagggccc cacctttgga ataaaaggtg 1320 
gcgctgcagg aggcggctac tcccaggtca ttcctatgga agagtttaat ctccacctca 1380 
caggtgacat ccatgccatc actgcagcta ataacctcgt tgctgcggcc attgatgctc 1440 
ggatatttca tgaactgacc cagacagaca aggctctctt taatcgtttg gtgccatcag 1500 
taaatggagt gagaaggttc tctgacatcc aaatccgaag gttaaagaga ctaggcattg 1560 
aaaagactga ccctaccaca ctgacagatg aagagataaa cagatttgca agattggaca 1620 
ttgatccaga aaccataact tggcaaagag tgttggatac caatgataga ttcctgagga 1680 
agatcacgat tggacaggct ccaacggaga agggtcacac acggacggcc cagtttgata 1740 
tctctgtggc cagtgaaatt atggctgtcc tggctctcac cacttctcta gaagacatga 1800 
gagagagact gggcaaaatg gtggtggcat ccagtaagaa aggagagccc gtcagtgccg 1860 
aagatctggg ggtgagtggt gcactgacag tgcttatgaa ggacgcaatc aagcccaatc 1920 
tcatgcagac actggagggc actccagtgt ttgtccatgc tggcccgttt gccaacatcg 1980 
cacatggcaa ttcctccatc attgcagacc ggatcgcact caagcttgtt ggcccagaag 2040 
ggtttgtagt gacggaagca ggatttggag cagacattgg aatggaaaag ttttttaaca 2100 
tcaaatgccg gtattccggc ctctgccccc acgtggtggt gcttgttgcc actgtcaggg 2160 
ctctcaagat gcacgggggc ggccccacgg tcactgctgg actgcctctt cccaaggctt 2220 
acatacagga gaacctggag ctggttgaaa aaggcttcag taacttgaag aaacaaattg 2280 
aaaatgccag aatgtttgga attccagtag tagtggccgt gaatgcattc aagacggata 2340 
cagagtctga gctggacctc atcagccgcc tttccagaga acatggggct tttgatgccg 2400 
tgaagtgcac tcactgggca gaagggggca agggtgcctt agccctggct caggccgtcc 2460 
agagagcagc acaagcaccc agcagcttcc agctccttta tgacctcaag ctcccagttg 2520 
aggataaaat caggatcatt gcacagaaga tctatggagc agatgacatt gaattacttc 2580 
ccgaagctca acacaaagct gaagtctaca cgaagcaggg ctttgggaat ctccccatct 2640 
gcatggctaa aacacacttg tctttgtctc acaacccaga gcaaaaaggt gtccctacag 2700 
gcttcattct gcccattcgc gacatccgcg ccagcgttgg ggctggtttt ctgtacccct 2760 
tagtaggaac gatgagcaca atgcctggac tccccacccg gccctgtttt tatgatattg 2820 
atttggaccc tgaaacagaa caggtgaatg gattattcta aacagatcac catccatctt 2880 
caagaagcta ctttgaaagt ctggccagtg tctattcagg cccactggga gttaggaagt 2940 
ataagtaagc caagagaagt cagcccctgc ccagaagatc tgaaactaat agtaggagtt 3000 
tccccagaag tcattttcag ccttaattct catcatgtat aaattaacat aaatcatgca 3060 
tgtctgttta ctttagtgac gttccacaga ataaaaggaa acaagtttgc ca 3112 

<210> 10 

<211> 1792 

<212> DNA 

<213> Homo sapiens 

<400> 10 

cgcagcccag actcagactg gggaagcaaa caggggctgg acaggccagg agagcctgtc 60 
ggacagtgat cctgagatgt gggagttgct gcagagggag aaggacaggc agtgtcgtgg 120 
cctggagctc attgcctcag agaacttctg cagccgagct gcgctggagg ccctggggtc 180 
ctgtctgaac aacaagtact cggagggtta tcctggcaag agatactatg ggggagcaga 240 
ggtggtggat gaaattgagc tgctgtgcca gcgccgggcc ttggaagcct ttgacctgga 300 
tcctgcacag tggggagtca atgtccagcc ctactccggg tccccagcca acctggccgt 360 
ctacacagcc cttctgcaac ctcacgaccg gatcatgggg ctggacctgc ccgatggggg 420 
ccatctcacc cacggctaca tgtctgacgt caagcggata tcagccacgt ccatcttctt 480 



cgagtctatg ccctataagc tcaaccccaa aactggcctc attgactaca accagctggc 540 
actgactgct cgacttttcc ggccacggct catcatagct ggcaccagcg cctatgctcg 600 
cctcattgac tacgcccgca tgagagaggt gtgtgatgaa gtcaaagcac acctgctggc 660 
agacatggcc cacatcagtg gcctggtggc tgccaaggtg attccctcgc ctttcaagca 720 
cgcggacatc gtcaccacca ctactcacaa gactcttcga ggggccaggt cagggctcat 780 
cttctaccgg aaaggggtga aggctgtgga ccccaagact ggccgggaga tcctttacac 840 
atttgaggac cgaatcaact ttgccgtgtt cccatccctt caggggggcc cccacaatca 900 
tgccattgct gcagtagctg tggccctaaa gcaggcctgc acccccatgt tccgggagta 960 
ctccctgcag gttctgaaga atgctcgggc catggcagat gccctgctag agcgaggcta 1020 
ctcactggta tcaggtggta ctgacaacca cctggtgctg gtggacctgc ggcccaaggg 1080 
cctggatgga gctcgggctg agcgggtgct agagcttgta tccatcactg ccaacaagaa 1140 
cacctgtcct ggagaccgaa gtgccatcac accgggcggc ctgcggcttg gggccccagc 1200 
cttaacttct cgacagttcc gtgaggatga cttccggaga gttgtggact ttatagatga 1260 
aggggtcaac attggcttag aggtgaagag caagactgcc aagctccagg atttcaaatc 1320 
cttcctgctt aaggactcag aaacaagtca gcgtctggcc aacctcaggc aacgggtgga 1380 
gcagtttgcc agggccttcc ccatgcctgg ttttgatgag cattgaaggc acctgggaaa 1440 
tgaggcccac agactcaaag ttactctcct tccccctacc tgggccagtg aaatagaaag 1500 
cctttctatt ttttggtgcg ggagggaaga cctctcactt agggcaagag ccaggtatag 1560 
tctcccttcc cagaatttgt aactgagaag atcttttctt tttccttttt ttggtaacaa 1620 
gacttagaag gagggcccag gcactttctg tttgaacccc tgtcatgatc acagtgtcag 1680 
agacgcgtcc tctttcttgg ggaagttgag gagtgccctt cagagccagt agcaggcagg 1740 
ggtgggtagg caccctcctt cctgttttta tctaataaaa tgctaacctg ca 1792 

<210> 11 

<211> 18596 

<212> DNA 

<213> Homo sapiens 

<400> 11 

cctgtagtcc cagctacgcg agaggctgag gcagcagaat tacttgaacc caggaggcgg 60 
aggttgcagt gagccgagat cgcgccactg cactccagcc tgggtgagag agcgagactc 120 
tgtctcaaaa aaaaaaaaaa aagaccgcca gggctcaaac aaaaaacctc ggaaaagccc 180 
tggcggtctt tttttttttt tttttttttt ttttttggga cagtcttgct ctgtcgccca 240 
ggctggagta caatggtcgg atcttggctc actgcaacct ctgcctccca ggttcaagca 300 
attcttctgc ctcagcctcc caagtagcca ccacgcccag ctaatttttg tacttttagt 360 
agagacgggg gtttcaccat gttgtccagg ctggtcttga actcctgacc tcaggtgatc 420 
cacccgcctc ggccccccaa agtactagga ttacaggcgt gagccaccgc gtccagcgcc 480 
ctggcggttt ttaatcaagt agaaaagctg cattatacca cttgcttcgg ttgcttcagt 540 
gagaacgaag aaatggaaat gcaaatccct tattagttgt aggaaacaga tctcaaacag 600 
cagttttgtt gacaagaccg caggaaaacg tgggaactgt gctgctggct tagagaaggc 660 
gcggtcgacc agacggttcc caaagggcgc agtccttccc agccaccgca cctgcatcca 720 
ggttcccggg tttcctaaga ctctcagctg tggccctggg ctccgttctg tgccacaccc 780 
gtggctcctg cgtttccccc tggcgcacgc tctctagagc gggggccgcc gcgaccccgc 840 
cgagcaggaa gaggcggagc gcgggacggc cgcgggaaaa ggcgcgcgga aggggtcctg 900 
ccaccgcgcc acttggcctg cctccgtccc gccgcgccac ttggcctgcc tccgtcccgc 960 
cgcgccactt cgcctgcctc cgtcccccgc ccgccgcgcc atgcctgtgg ccggctcgga 1020 
gctgccgcgc cggcccttgc cccccgccgc acaggagcgg gacgccgagc cgcgtccgcc 1080 
gcacggggag ctgcagtacc tggggcagat ccaacacatc ctccgctgcg gcgtcaggaa 1140 
ggacgaccgc acgggcaccg gcaccctgtc ggtattcggc atgcaggcgc gctacagcct 1200 
gagaggtgac gccgcgggcc cctgcgggac gggtggcggg aaggagggag gcgcggctgg 1260 
ggagagcgct cgggagctgc cgggcgctgc ggaccccgtt tagtcctaac ctcaatcctg 1320 
ccagggaggg gacgcatcgt cctcctcgcc ttacagacgc cgaaacggag ggtcccatta 1380 
gggacgtgac tggcgcgggc aacacacaca gcagcgacag ccgggaggta agccgcgtcc 1440 
cagcggctcc gcggccgggc tcgcagtcgc cccagtgatg ccgtggcccc cgaggcgggc 1500 
gtcatcgggc agcgtttgcc cagtgctgga gggttaggga gagctgcctg ggcttgaccg 1560 
cgcgccggtc tcaaagtcct ggctttggcc cctcctccgt tttcccctgt ggaccattcc 1620 
gcttcgcagc gttttcaaaa actggagcga aagtgatgtg ggcggggcaa aggcggcggg 1680 



aagaggacag cactgaagct ggcgcgggaa cttggtttcc tggtggcctc ccatccaatc 1740 
cccacgaacc agctttcctc ttaaaccttg aaaagagaaa ttcgggagtt cgagttctta 1800 
gtcgtccttt cctctttcct ttccgacagg agcaccccag gcaaaaaatg tctcgcgggt 1860 
cattggcgcc aggctttcag gggacagtgg ggcggggcgg ggtgggcaca ggacgttagg 1920 
cagccgttgg ccctccctaa ggccacaccg tcctgccgtc ctggatcctg cgccagctgc 1980 
gcgggggagg ggactcgaag gtgtgtgagc caggggctga ccttgaccgc tcagataaat 2040 
ggagcgcagc cttgacacag gggtggaggt ggttttgaat ggggaaaccc attcgtggtg 2100 
aagcagattc actgtagcta gcggaaaagc cctccggccc acggacccat ctagagacga 2160 
atacatagca gctgctgtgg ctgattggcg tgggacagcg tggggagttt tgtctgagga 2220 
gagggatcca cttttctgca gctccaagcc caggggcctt tgatgagcca tagacctcat 2280 
ttttaaccca cctttctgct tagacattga gcaagttact tctcatatag cttccctata 2340 
tgttaaaaat ggagaaaata atgcttagta ggcaattctg ataaaagcag gtgcttgcaa 2400 
aaatctctct gttgtctgaa tataaactgt accacaagcg agtgcggatg aacgaggact 2460 
gcatttaaag ataagttttt acactttcat ttctctgtgg ctcgacactt ctgatgcctc 2520 
cctttttgtt cctgggacac atgcttggtg ttgtcttcac acctttgtga caggattagc 2580 
actagtgggc agtggatgat agctcctcct cccttttgcc acatgttcat ccctgccctc 2640 
gccaccatct cactgtgtgg aattcctgtg tccactggtc accggggcac agaagtgctg 2700 
tctcagcctg aatcgggcca ctgatgggac ttgcagcctg ggagctccac cgtgatctct 2760 
ggcccacttt gcgggagtct aggctttctg gatgctccag gcctcacgtc ccagggcagt 2820 
tttcttccct gaagaaagtt ggatggcatg atctgtcttc ccatcttgaa accgtatggc 2880 
aaattgtttt tcagatgaat tccctctgct gacaaccaaa cgtgtgttct ggaagggtgt 2940 
tttggaggag ttgctgtggt ttatcaaggt aaagaagtcg ctgctattag aagtcagtag 3000 
tctgttctca acacagcagc cagtgagatc ctttcaaaac tcaaagcagc caggtgtggt 3060 
ggctcacgcc tgtaatccca ccgctttggg aggctgagtc agatcacctg aggttaggaa 3120 
tttgggacca gcctggccaa catggcgaca ccccagtctc tactaataac acaaaaaatt 3180 
agccaggtgt gctggtgcat gtctgtaatc ccagctactc aggaggctga ggcatgagaa 3240 
ttgctcacga ggcggaggtt gtagtgagct gagatcgtgg cactgtactc cagcctggcg 3300 
acagagggag aacccatgtc aaaaacaaaa aaagacacca ccaaaggtca aagcatatca 3360 
ttcctcaccc tcaagccctt agtggctcca tttcactcag taagagccac ggtccttatg 3420 
gtgtccgttt ttcagctctg accttagctg ctgctctctg caccaccctg ctgttcttgt 3480 
gagtttttga gcacaccggg acatccccac tccctggaac cttcttcccc cacacttggc 3540 
ttcttccttt gagtctctac tccactcggg caagccttcc tagacctcct gatttaaaac 3600 
tgtgactctc ccccaacctc cttggtgttt ctccgtagac gaacatcacc atctgatgta 3660
tgtcagcctt tcccttcccc tgttagaagg gggacagcag gtagtaaaag tgaaatgtgc 3720
tgtaagcttt atgagggcag aggatttgtt tctcgtgttc actgttgtat cgccagggcc 3780 
tcaaacacag cctgccacat agtaggagtc aacatatatt gatcactaaa tgtagatacc 3840
acctgtgttc ccatgttcat ataaattcta gaagagtctc ttcagtaaca aggtgaaccc 3900 
cttccagagg gctgagtagg tacctcaggc cggggccaga gtgctgtgaa gacagcagca 3960 
gcccagacca agcttctctg tgttccgtgt cctggtctag aaccagcgat gttctttctg 4020
accagtgctt tttggaaggt ggctgaggtc tgggctcagg tctgggccat actagaagct 4080 
gggatccctt ctatagagca cttggtatgg cttgtatggt cttggggcaa gccagaccca 4140 
agccctctta tcccatttta gaaagggctt caatttggat ccagccccag gtctgcctta 4200 
gctctgtatt cttggggtat tttgttctgt attggcctat cttgactaac aatgagcctt 4260 
ggatttgaaa catatcatca gaaacctcag aagacaacat tcttaaactg gctagagcct 4320
ggtctgaatg gatgaaaagg agagactttt gaagcaatat gtaaaagatt gagaaatgat 4380
ttgttggaaa tttctcaatt ggagaaattt ctttgatttg ttggaaattt ctttgattct 4440 
ttctcaatca aagaaaatcg ggacaaactc aacaatagaa agggaggaag caagatactc 4500 
agaaataaaa tgcattcccc tgtttcaact taatgcttca attcaggatt ctaaggaatc 4560 
cttgccagga atgtcagact caccttgata gttggagtta ctccattggt gactcgatca 4620
aatacaggag ttgaggcacc tgcactgtaa aatactgatt agtctgatca ttaggaatat 4680 
cctgtatgcc aggtagaaga tacattgaac agattgcatg taggcattaa attcattttg 4740 
gggtattaca tatagacaac acatttcatt aagaaacata aaactgtcag atcggtggaa 4800 
tacttaaaag cacttggagg tgtttagcct aaaaagctta gttgagggga atggaagaaa 4860 
agatctggga gggtggttcc aaagaaggga tcagactatc ctaaagccct caggaatctg 4920
ggctgggacc acctacttaa agataggatg ggcagctggg tgtggtggct cacgcctgta 4980 
atcccagcac ttcgggaggc cgaagcgggc ggatcacctg aggtcaggag ttcgaggcca 5040
gcctgaccaa catggagaaa cgctgtctct actaaaaata caaaattagc tgggtgtagt 5100



ggcgcatgcc tgtaatccca gctactcggg aggctgaggc aggggaatcg cttgaacctg 5160 
ggaggtggag ggtgccgtga gccacgatcg cgccattgca ctccagcctg ggcaacaaga 5220
gcgaaactct caaaaaacaa aaaaaaggat gggttccata tgggtggtgt caagtgccca 5280
cctcctagca agtcagcagg ggccagaggc ccttgtaagt ggtgtctcgg ggggatcaac 5340 
tgagatggct taagatttac ctggatgcct gctctgctct ccccatctct tccagggatc 5400 
cacaaatgct aaagagctgt cttccaaggg agtgaaaatc tgggatgcca atggatcccg 5460
agactttttg gacagcctgg gattctccac cagagaagaa ggggacttgg gcccagttta 5520
tggcttccag tggaggcatt ttggggcaga atacagagat atggaatcag gtgaggagat 5580
agaacaatgc cttccatttc cgggtgccct tcctagcacg tgtttgctcc gttgttttag 5640 
ataaggtctg ggggatgagt caatgtcaca ggagctgatg tatagctttg accttgtgag 5700 
gggtggtgcc aggttgaagc cacaattaac gcctactgaa ggccgtttca catctttttt 5760 
tttttttttt ttttaattat tatactttaa gttttagggt acatgtgcac aatgtgcagg 5820
ttagttacat atgtatacat gtgccatgct ggtgcgctgc accactaact caccatctag 5880
catcaggtat atctcccaat gctatccctc ccccctcctc ccaccccaca acatccccag 5940 
agtgtgatgt tccccttcct gtgtccatat gttctcgttg ttcgattccc actatgagtg 6000 
agaatatgcg gtgtttggtt ttttgttctt gcgatagttt actgagaatg atgatttcca 6060 
tttcaccacg tccctacaga ggacatgaac tcatcatttt ttatggctgc atagtattcc 6120 
atggtgtata tgtgccacat tttcttaatc cagtctatca tgttggacat ttgggttggt 6180 
tccaagtctt tgcctattgt gaatagtgcc acaataaaca tacgtgtgca tgtgtcttta 6240 
tagcagcatg atttaatagt cctttgggta tatacccagt aatgggatgg ctgggtcaaa 6300
tggtatttct agttctagat ccccgaggaa tcgccacact gacttccaca atggttgaac 6360 
tagtttacag tcccaccaac agtgtcaaag tgtcctattt ctccacatcc tctccagcac 6420
ctgttgtttc ctgacttttt aatgattgcc attctaactg gtgtgagatg gtatctcatt 6480 
gtggttttga tttgcgtttc tctgatggcc agtgatggtg agcatttttt catgtgtttt 6540 
ttggctgcat aaatgtcttc ttttgagaag tgtctgttca tgtccttcgc ccactttttg 6600 
atggggttgt ttttttctta taaatttgtt tgagttcatt gtagattctg gatattagcc 6660 
ctttgtcaga tgagtaggtt gcaaaaatgt tctcccattt tgtgggttgc ctgttcactc 6720
tgatggtagt ttcttttgct gtgcagaagc tctttagttt aattagatcc catttgtcaa 6780 
ttttggcttt tgttgccatt gcttttggca taggcatgaa gtccttgccc atgcctatgt 6840 
cctgaatggt aatgcctagg ttttcttcta gggtttttat ggttttaggt ctaacgttta 6900 
agtctttaat ccatcttgaa ttgatttttg tataaggtgt aaggaaggga tccagtttca 6960 
gctttttaca tatggctagc cagttttccc agcaccattt attacatagg gaatcctttc 7020 
cccattgctt gtttttctca ggtttgtcaa agatcagata gttgtagata tgcggcgtta 7080 
tttctgaggg ctctgttctg ttccattgat ctatgtgtct gttttggtac cagtaccata 7140 
ctgttttggt tactgtagcc ttgtagtata gtttgaagtc aggtagcgtg atgcctccag 7200 
ctttgttctt ttggcttagg attgacttgg cgatgcgggc tcttttttgg ttccatatga 7260 
actttaaagt agttttttcc aattctgtga agaaagtcat tggtagcttg atggggatgg 7320
cattgaatct ataaattacc ttgggcagta tggccatttt cacgatattg attcttccta 7380 
cccatgagca tggaatggtc ttccatttct ttgtatcctc ttttatttca ttgagcagtg 7440 
gtttgtagtt ctccttgaag aggtccttca catccctttt aaggtggatt cctaggtatt 7500 
ttattctctt tgaagcaatt gtgagtggaa gttcactcat gatttggctc tctgtttgtc 7560 
tgttattggt gtataagaat gcttgtgatt tttgcagatt gattttatat cctgagactt 7620 
tgctgaagct gcttatcagc ttaaggagat tttgggctga gacaatgggg ttttctagat 7680 
atacaatcat gtcgtctgca aacagggaca atttgacttc ctcttttcct aattgaatac 7740 
cctttatttc cttctcctgc ctaattgccc tggccagaac ttccaacact atgttgaata 7800 
ggagtggtga gagagggcat ccctgtcttg tgccagtttt caaagggaat gcttccagtt 7860 
tttgcccatt cactatgata ttggctgtgg ctttgtcata gatagctctt attattttga 7920 
aatatgttcc atcaatacct aatttattga gagtttttag catgatgtgt tgttgaattt 7980 
tgtcaaaggc tttttctgca tctattgaga taatcatgtg gtttttgtct ttggatctgt 8040 
ttatatgctg gattacattt attgatttgc gtatattgaa ccagccttgc atcctaggga 8100 
tgaagcccac atgatcatgg tggataagct ttttgatgtg ctgctggatt cggtttgcca 8160 
gtattttatt gaggattttt gcatcaatgt tcatcaagga tattggtcta aaattctctt 8220 
ttttggtgtg tctctgccca gctttggtat caggatgatg ttggcttcat aaaatgagtt 8280 
agggaggatt ccctcttttt ctattgattg gaatagtttc agaaggaatg gtaccagttc 8340 
ctctttgtac ctctggagaa ttcggctgtg aatccatctg gtcctggact ctctttggtt 8400 
ggtaagctat tgattattgc cacaatttca gctcctgtta ttggtctatt cagagattca 8460 
acttcttcct ggtttagtct tgggagagtg tatgtgtcaa ggaatttatc catttcttct 8520 



agattttcta gtttatttgc gtagaggtgt ttgtagtaat ctctgatggt agtttgtatt 8580
tctgtgggat cggtggtgat atccccttta tcatttttta ttgcgtctat ttgattcttc 8640
tctttttctt tattagtctt gctagcggtc tataaatttt gttgatcctt tcaaaaaacc 8700
agctcctgga ttcattaatt ttttgaaggg ttttttgtgt ctctatttcc ttcagttctg 8760
ctctgatttt agttatttct tgccttctgc tagcttttga atatgtttgc tcttgctttt 8820
ctagttcttt taattgtgat gttagggtgt caattttgga tctttcctgc tttctcttgt 8880
gggcatttag tgctataaat ttccctctac acactgcttt gaatgtgtcc cagaggttct 8940
ggtatgttgt gtctttgttc ttgttggttt caaagaacat ctttatttct gccttcattt 9000
cgttatgtac ccagtagtca ttcaggagca ggttgttcag tttccatgta gttgagcagt 9060
tttgagtgag attcttaatc ctgagttcta gtttgattgc actgtggtct gagagatagt 9120
ttgttataat ttctgttctt ttacatttgc tgaggagagc tttacttcca actatgtggt 9180
cggttttgga ataggtgtgg tgtggtgctg aaaaaaatgt atattctgtt gatttgggat 9240
ggagttctgt agatgtctat taggtctgct tggtgcagag ctgagttcaa ttcctgggta 9300
tccttgttga ctttctgtct cgttgatctg tgtactgttg acagtgggtg ttaaagtctc 9360
ccattattaa tgtgtggagt ctaagtctct ttgtaggtca ctcagatgat tggcacttac 9420
tgggcgcttg gcactttcca tactgtgtca tcggcagata gctgcatggt tggtgttcgt 9480
gctggggaat gggaagttca tcggtgggac aaggacaaaa tgcccccatt gctttgttgt 9540
ggctttaatc tccctttcga ggctgagcca cagcgtgctg taggtggcgc tgctgtgaag 9600
cgcagtacca gggtcacact ccactcccag ctctgcagag gtggagaaag aatgaaacat 9660
ctcactcctg gacttccact ttcctgtcac tgttggtgtc acctcttact ggatgtcaca 9720
gagcccagcc cctcccacct gtgcctagga aaagcagatg ccaccttgga atgtggggtt 9780
tgtgtgtgca atttactagc tgggcagaga ccagcaacct ggagagcagg tgtctcgtct 9840
aaggggacag tcacatttca cctccagcca cctggaggaa tttgggcctg gtgatgtcag 9900
aattcttcaa taaaagccta aaatctatat tttatgtgcg gtcatgagat ctgttaaatg 9960
ttagcaactt caggaagttt aaaaatgctg tgtggaccta gaataggcaa gttcttaaag 10020
gcagaaagtg gaatgctagt ttccagggac tggggaacag ggaggaatgg ggagttcatg 10080
tttaatgggc acagaggttt tgttagggat gacgaaaaag ttcgggagat ggtgatggtg 10140
atggagatgg tgatggtgat ggagatggtg atggtgatgg tgatggtgat gggtgatggt 10200
gatggtgatg gtgatggtga tggagatggt gatggtgatg gtgatggaga tggtgatggt 10260
gatggtgatg gtgatggaga tggtgatggt gatggagatg gtgatggtga tggtgatgga 10320
gatggtgatg gtgatggtga tggtgatggt gatggtgatg gtgatggaga tggagatggt 10380
gatggtgatg gttgcctaac atcaggaacg tgcttaatgc ttctgaattg cacacaaaaa 10440
tggcaagttt aatattatgt gtactttatc acaatgaaaa aagctgctgc gtgggccaag 10500
ttacttgtgc aggtaatgtt ctgcaggtgg ttgcctgcac ctcagttgta gggtgtccgt 10560
aggatgtgag gccagtcccc gggcttaatg atgctttaaa tcctgcctag tattcaatta 10620
tttcttgtcg cttaaaaggc ctaataaaat tatggtctta gtttacagtg gtatgaatgc 10680
ttagctgttg gattttagta ggaaagttcg tccctttttg tttttaattt tgttttacag 10740
attcacagga attttttttt tttttttttt tttttttttt taatgcacag aaagtttccc 10800
tggactctct acccagtttc cccagtgata atatcttggg taacatcctg tatacattca 10860
cattggtgca ttcctcagag ttgtcagatt ttgctagttt tacgtgcact tgtgtatgtg 10920
tgtatttgca attttagcac gtgtagactc ttgtaaccac tacaatcaag ttacagaact 10980
acactaccaa ggttcatctt tttaaaatct ttgatgttac cttttttgga acagtgacca 11040
tgagaggact ttcctcccaa aattttgaaa actactgaac cagaatatag tctgacacta 11100
ataggtagaa atttaaccaa aggagattat gaagctctgc acttgagtta acaaaatcac 11160
ttctcagctt ccagttccat ctcagaagga aggaaaaggg attaaaaatc cagagaccag 11220
aaaatgggag caaagtacaa ggtggtgtaa tcattacaga ggtttcctga tgtttccaag 11280
tcagtcgtgt gttgagctgc taaactctaa agtaatttta ggtggaatgt tggaaacatg 11340
ctgctgaggt gatagaaagg aatccatggt cctctgttag ttggaaagta tatggaatac 11400
tatattctac ataagataca atactctctg tgagacaagg ataaagtaga ttttgtcagt 11460
gaaattgtga caagaatcgc tgatgggttt agagcctaag tttgcgagga gcactggaag 11520
aaattaagat tgttgagatt ggaaagggtt agctatgggg gaacaggagg aggtgactcc 11580
atgacagacc aaatattcaa aggactgtgt agaagaggaa aaagactttg ttagggctcc 11640
agaggacaga gccaggagtc agacagggcc ttgaactcaa cccaccgaga tctgcaaact 11700
ttgcaggatg caccagatgt cttgtagcca tgggtcaagg ggggaccctg ggtaagagac 11760
tgtaatagat gacctctaag gccatctcat gacatgtgtg attaatgtat gtacctgtcc 11820
tctctttttg acaattctac agattattca ggacagggag ttgaccaact gcaaagagtg 11880
attgacacca tcaaaaccaa ccctgacgac agaagaatca tcatgtgcgc ttggaatcca 11940



agaggttgaa agaaccccgt cgtcttcatt tatactaacc atactcttag agggaagcaa 12000
tctggttttg tgcagaggca ctgagggagg caggaccctg ggcaacttcc cccagccaca 12060
tggttgtgtg acgttgggca agtcacattt tgctgcactt tcaccttcag atcatgaggt 12120 
tgggcccaga ggattttttt tttttttttt ttttttgaga cagagttttg ctctgttgcc 12180 
caggctggaa tgcaacggcg tgatcttggc tcactgtaac ctctgcctcc tgggttcgag 12240 
tgattctcct gcctcagcct ccaagtagct gggattacag catgtgccac catgcctggc 12300
taattttgta tttttagtag agacgggttc acatgttggt caggctggtc ttgactcctg 12360
accctcagat gatctgcctt gcctcagcct cccaaccgag tgatcttaag ttgtgtatta 12420 
tactcattct tacacaaaaa gggctttaaa tgcctagaaa ctacatgaag atgttaacat 12480 
tttaaatgga agcagatgaa gttccagctc gctgccacct cactaacatt tttaacaatt 12540
atattgtaaa attcaactct accagggtgt agagccaggt gtggtggctc acacctgtaa 12600 
ttccaacaac tccagaggcc aaggcgagag gatcatttga acccacggaa tttgaggctg 12660 
tagtgagtca tgatcacgcc attgcactcc atcctgggca acagagtgag accctgaata 12720
tttaaaaaca acaacaacaa caaaactcta tcaggatatc ataagtactt agagtgaaat 12780 
acttgcatct gtaatagaga cttatttttt ttttttttga gacacagtct caccctgttg 12840 
cccaggctgg agtgcagtgg tttgatctcc gctcacggca acctccatct cccaggttca 12900
agtgagttcc cattcctcag ccccagagct gggaccacag gcgcgcgaat ttttgtattt 12960 
ttagcagaga cggggtttca ctatgttggc caggctagtc tcaaactcaa gttggcctca 13020
agtgatctgc ccaccctggc gtcccagtgt tgggatttca ggcatgagcc actgtgcctg 13080
gccatgtaat agagactttt aatataggag ggtgtaccag aagcaccagt ttcctgtggc 13140 
aaacagaatt attcctgctg tatttgtaat ttggtgccac gaggtagccc agatcccttc 13200 
agctctgatg gaagagcatt gcttcagccg taaatggaca cctgcagaaa ccttgcaccg 13260 
atggatagtc tccctcagct ccgtgccatc gctgcagggg ctgttatgga catcactgca 13320 
gcccagtggc tctctctcct ggtctccacc atatgagttg gcttctgttt ctctcctgtt 13380 
ttactttgcc tttagctgtg gtctttcaaa ccaccatccc tccttatctt cctctgctgg 13440 
ttcctcagat cttcctctga tggcgctgcc tccatgccat gccctctgcc agttctatgt 13500 
ggtgaacagt gagctgtcct gccagctgta ccagagatcg ggagacatgg gcctcggtgt 13560
gcctttcaac atcgccagct acgccctgct cacgtacatg attgcgcaca tcacgggcct 13620 
gaaggtgggc tgtctcggga agggtgactt gccagcctac cacatgagct cttcagttct 13680 
ttaatatggg aaaacaaatt gcagagttta gtctctgatt agcttttaaa tttgatatgt 13740 
gtaagtaaga catgaaccag cttttacttt gaaaccttcc ttttctggaa ggttttctgg 13800 
ccctgtggta tatgcactaa cagatctata caggttgttt gtgatacagc ttctatggat 13860
cttctcaaaa gctatgctga ggttgggtat ggtggctcat gcctgtaatc ccagcacttt 13920
ggaagactga gacaggagca attgcttgag gtctggagtt caataccagc ctgggcaaca 13980 
taacaagatg ctgttgctac aaaaaaatgg aaaagctaca ctaaattatt tttttaaaaa 14040 
aagccttgcg gtgtctgcat attctaatgt ttttaaatga tgttttaaag aattgaaact 14100 
aacatactgt tctgctttct cccggtttat agccaggtga ctttatacac actttgggag 14160 
atgcacatat ttacctgaat cacatcgagc cactgaaaat tcaggtaaga attagatgtt 14220
atacttttgg gtttggtacc ttctcttgat aaaaggttga ctgtggaaca ggtatctgct 14280 
caatgctgtg tccaagataa agatgactgc tccaaatgtg gggcttcagt ttagggagaa 14340 
gtggtgggca ggtgggcagg acaaggcagg catctgcctc agcaaccatg gcacttaact 14400 
tgtcaggtgc tgtgaggtac taagcaccag taccagagag ggaagagcca cattcaagcc 14460 
aggggattgt ccaaaaggag gcattttaac tcattttaac ttgaaggaga attgaagtgc 14520
aaatgttttt ccttttcttt ttttttgaga tggagtcttt ctctgtcggc caggctggag 14580 
tgtgccgtgg tgcgatctca gctcactgca acctccacct cccgggttca agcaattctt 14640 
ctgcctcagc ctcccaggta gctgggatta caggcacatg ccaccacacc cagctaattt 14700 
tttgtattat tagtagagat ggggtttcgt catgttggcc aggctgatct caaactcctg 14760 
acttcaagtg taccacctgc ctcagcctcc gaaagttctg gaattacagg cataagccac 14820 
caccctggcc ataaatattt tttgttaatt ttacattaag tacaatattt aggtccaaac 14880 
ttcaaaagtc tgttgaaatc cctgaagtta tagcagccaa caattgatat gaaatggcaa 14940 
taaaaatgta agttcatctg cttcatgagc cttaaggaaa aaaactcaga accagacact 15000 
ttttagcccc ttccaggtta gatccaggtt ttaaaagtta ttcctttgag ggagtttggc 15060 
tgcttttgag tggaggtgac ttcaggctta ttctctctgg ctctctgctc tggtcatttt 15120 
tagacatagt aataggttgt gacctgtctt cacatcctaa ttgccactgt ctgttcatcc 15180 
caggaatcct ggctttcatc cctttctgtt cactgtccat gcatgtcatc tttccttctt 15240 
tctgccaggg accagatggg ttagggattg tgaattcaag taaacgtaga gctactatga 15300 
gttacagatt gactgtgttc ctgtctttaa taaatttgcc aagagtggtt ataagaactt 15360 



acacctgatg aggcaccagg ctcctgatgc tgtgtaatgt cacaaaatac ccctcactct 15420
cgatctgtgc aagagaacag ctggttgcgc tccaatcatg ttacataacc tacgcgaagg 15480 
tatcgacagg atcatactcc tgtaaaatag aactttgttg atcacatcct gtgtacttgt 15540 
ttcacggaca tgaggagcaa ttacaacagg tcgtacaatt atggcaaaat aatggcctta 15600 
ttttgttttt agcttcagcg agaacccaga cctttcccaa agctcaggat tcttcgaaaa 15660 
gttgagaaaa ttgatgactt caaagctgaa gactttcaga ttgaagggta caatccgcat 15720 
ccaactatta aaatggaaat ggctgtttag ggtgctttca aaggagctcg aaggatattg 15780 
tcagtcttta ggggttgggc tggatgccga ggtaaaagtt ctttttgctc taaaagaaaa 15840 
aggaactagg tcaaaaatct gtccgtgacc tatcagttat taatttttaa ggatgttgcc 15900
actggcaaat gtaactgtgc cagttctttc cataataaaa ggctttgagt taactcactg 15960 
agggtatctg acaatgctga ggttatgaac aaagtgagga gaatgaaatg tatgtgctct 16020 
tagcaaaaac atgtatgtgc atttcaatcc cacgtactta taaagaaggt tggtgaattt 16080 
cacaagctat ttttggaata tttttagaat attttaagaa tttcacaagc tattccctca 16140 
aatctgaggg agctgagtaa caccatcgat catgatgtag agtgtggtta tgaactttaa 16200 
agttatagtt gttttatatg ttgctataat aaagaagtgt tctgcattcg tccacgcttt 16260 
gttcattctg tactgccact tatctgctca gttccttcct aaaatagatt aaagaactct 16320 
ccttaagtaa acatgtgctg tattctggtt tggatgctac ttaaaagagt atattttaga 16380 
aataatagtg aatatatttt gccctatttt tctcatttta actgcatctt atcctcaaaa 16440 
tataatgacc atttaggata gagttttttt tttttttttt taaactttta taaccttaaa 16500 
gggttatttt aaaataatct atggactacc attttgccct cattagcttc agcatggtgt 16560
gacttctcta ataatatgct tagattaagc aaggaaaaga tgcaaaacca cttcggggtt 16620 
aatcagtgaa atatttttcc cttcgttgca taccagatac ccccggtgtt gcacgactat 16680 
ttttattctg ctaatttatg acaagtgtta aacagaacaa ggaattattc caacaagtta 16740 
tgcaacatgt tgcttatttt caaattacag tttaatgtct aggtgccagc ccttgatata 16800 
gctatttttg taagaacatc ctcctggact ttgggttagt taaatctaaa cttatttaag 16860 
gattaagtag gataacgtgc attgatttgc taaaagaatc aagtaataat tacttagctg 16920 
attcctgagg gtggtatgac ttctagctga actcatcttg atcggtagga ttttttaaat 16980 
ccatttttgt aaaactattt ccaagaaatt ttaagccctt tcacttcaga aagaaaaaag 17040 
ttgttggggc tgagcactta attttcttga gcaggaagga gtttcttcca aacttcacca 17100 
tctggagact ggtgtttctt tacagattcc tccttcattt ctgttgagta gccgggatcc 17160 
tatcaaagac caaaaaaatg agtcctgtta acaaccacct ggaacaaaaa cagattttat 17220 
gcatttatgc tgctccaaga aatgctttta cgtctaagcc agaggcaatt aattaatttt 17280
tttttttttg acatggagtc actgtccgtt gcccaggctg cagtgcagtg gcgcaatctt 17340 
ggctcactgc aacctccacc tcccaggttc aagtgattct cctgcctcag cctcccatgt 17400 
agctgggatc acaggcacct gccaccatgc ccggctaatt ttttgtattt tttgtagaga 17460 
cagggtttca ccatgttggc caggctggtc tcaaacacct gacctcaaat gatccacctg 17520 
cctcagcctc ccaaagtgtt gggattacag gcgtaagcca ccatgcccag ccctgaatta 17580 
atatttttaa aataagtttg gagactgttg gaaataatag ggcagaggaa catattttac 17640 
tggctacttg ccagagttag ttaactcatc aaactctttg ataatagttt gacctctgtt 17700 
ggtgaaaatg agccatgatc tcttgaacat gatcagaata aatgccccag ccacacaatt 17760 
gtagtccaaa ctttttaggt cactaacttg ctagatggtg ccaggttttt ttgcacaagg 17820 
agtgcaaatg ttaagatctc cactagtgag gaaaggctag tattacagaa gccttgtcag 17880
aggcaattga acctccaagc cctggccctc aggcctgagg attttgatac agacaaactg 17940 
aagaaccgtt tgttagtgga tattgcaaac aaacaggagt caaagcttgg tgctccacag 18000 
tctagttcac gagacaggcg tggcagtggc tggcagcatc tcttctcaca ggggccctca 18060 
ggcacagctt accttgggag gcatgtagga agcccgctgg atcatcacgg gatacttgaa 18120 
atgctcatgc aggtggtcaa catactcaca caccctagga ggagggaatc agatcggggc 18180 
aatgatgcct gaagtcagat tattcacgtg gtgctaactt aaagcagaag gagcgagtac 18240 
cactcaattg acagtgttgg ccaaggctta gctgtgttac catgcgtttc taggcaagtc 18300
cctaaacctc tgtgcctcag gtccttttct tctaaaatat agcaatgtga ggtggggact 18360
ttgatgacat gaacacacga agtccctctg agaggttttg tggtgccctt taaaagggat 18420 
caattcagac tctgtaaata tccagaatta tttgggttcc tctggtcaaa agtcagatga 18480 
atagattaaa atcaccacat tttgtgatct atttttcaag aagcgtttgt attttttcat 18540 
atggctgcag cagctgccag gggcttgggg tttttttggc aggtagggtt gggagg 18596 



<210> 12 
<211> 3291 



<212> DNA 

<213> Homo sapiens 



<400> 12 

accgggcaag cgggaaccag gtggccaccc ggtgtcggtt tcattttcct ttggaatttc 60 
tgctttacag acagaacaat ggcagcccga gtacttataa ttggcagtgg aggaagggaa 120
catacgctgg cctggaaact tgcacagtct catcatgtca aacaagtgtt ggttgcccca 180 
ggaaacgcag gcactgcctg ctctgaaaag atttcaaata ccgccatctc aatcagtgac 240 
cacactgccc ttgctcaatt ctgcaaagag aagaaaattg aatttgtagt tgttggacca 300 
gaagcacctc tggctgctgg gattgttggg aacctgaggt ctgcaggagt gcaatgcttt 360 
ggcccaacag cagaagcggc tcagttagag tccagcaaaa ggtttgccaa agagtttatg 420
gacagacatg gaatcccaac cgcacaatgg aaggctttca ccaaacctga agaagcctgc 480 
agcttcattt tgagtgcaga cttccctgct ttggttgtga aggccagtgg tcttgcagct 540 
ggaaaagggg tgattgttgc aaagagcaaa gaagaggcct gcaaagctgt acaagagatc 600
atgcaggaga aagcctttgg ggcagctgga gaaacaattg tcattgaaga acttcttgac 660 
ggagaagagg tgtcgtgtct gtgtttcact gatggcaaga ctgtggcccc catgccccca 720
gcacaggacc ataagcgatt actggaggga gatggtggcc ctaacacagg gggaatggga 780 
gcctattgtc cagcccctca ggtttctaat gatctattac taaaaattaa agatactgtt 840 
cttcagagga cagtggatgg catgcagcaa gagggtactc catatacagg tattctctat 900
gctggaataa tgctgaccaa gaatggccca aaagttctag agtttaattg ccgttttggt 960 
gatccagagt gccaagtaat cctcccactt cttaaaagtg atctttatga agtgattcag 1020
tccaccttag atggactgct ctgcacatct ctgcctgttt ggctagaaaa ccacaccgcc 1080
ctaactgttg tcatggcaag taaaggttat cctggagact acaccaaggg tgtagagata 1140 
acagggtttc ctgaggctca agctctagga ctggaggtgt tccatgcagg cactgccctc 1200
aaaaatggca aagtagtaac tcatgggggt agagttcttg cagtcacagc catccgggaa 1260
aatctcatat cagcccttga ggaagccaag aaaggactag ctgctataaa gtttgaggga 1320
gcaatttata ggaaagacgt cggctttcgt gccatagctt tcctccagca gcccaggagt 1380
ttgacttaca aggaatctgg agtagatatc gcagctggaa atatgctggt caagaaaatt 1440 
cagcctttag caaaagccac ttccagatca ggctgtaaag ttgatcttgg aggttttgct 1500
ggtctttttg atttaaaagc agctggtttc aaagatcccc ttctggcctc tggaacagat 1560
ggcgttggaa ctaaactaaa gattgcccag ctatgcaata aacatgatac cattggtcaa 1620
gatttggtag caatgtgtgt taatgatatt ctggcacaag gagcagagcc cctcttcttc 1680
cttgattact tttcctgtgg aaaacttgac ctcagtgtaa ctgaagctgt tgttgctgga 1740 
attgctaaag cttgtggaaa agctggatgt gctctccttg gaggtgaaac agcagaaatg 1800
cctgacatgt atccccctgg agagtatgac ctagctgggt ttgccgttgg tgccatggag 1860
cgagatcaga aactccctca cctggaaaga atcactgagg gtgatgttgt tgttggaata 1920
gcttcatctg gtcttcatag caatggattt agccttgtga ggaaaatcgt tgcaaaatct 1980 
tccctccagt actcctctcc agcacctgat ggttgtggtg accagacttt aggggactta 2040
cttctcacgc ctaccagaat ctacagccat tcactgttac ctgtcctacg ttcaggacat 2100 
gtcaaagcct ttgcccatat tactggtgga ggattactag agaacatccc cagagtcctc 2160 
cctgagaaac ttggggtaga tttagatgcc cagacctgga ggatccccag ggttttctca 2220 
tggttgcagc aggaaggaca cctctctgag gaagagatgg ccagaacatt taactgtggg 2280
gttggcgctg tccttgtggt atcaaaggag cagacagagc agattctgag ggatatccag 2340 
cagcacaagg aagaagcctg ggtgattggc agtgtggttg cacgagctga aggttcccca 2400 
cgtgtgaaag tcaagaatct gattgaaagc atgcaaataa atgggtcagt gttgaagaat 2460 
ggctccctga caaatcattt ctcttttgaa aaaaaaaagg ccagagtggc tgtcttaata 2520
tctggaacag gatcgaacct gcaagcactt atagacagta ctcgggaacc aaatagctct 2580 
gcacaaattg atattgttat ctccaacaaa gccgcagtag ctgggttaga taaagcggaa 2640
agagctggta ttcccactag agtaattaat cataaactgt ataaaaatcg tgtagaattt 2700 
gacagtgcaa ttgacctagt ccttgaagag ttctccatag acatagtctg tcttgcagga 2760
ttcatgagaa ttctttctgg cccctttgtc caaaagtgga atggaaaaat gctcaatatc 2820 
cacccatcct tgctcccttc ttttaagggt tcaaatgccc atgagcaagc cctggaaacc 2880 
ggagtcacag ttactgggtg cactgtacac tttgtagctg aagatgtgga tgctggacag 2940 
attattttgc aagaagctgt tcccgtgaag aggggtgata ctgtcgcaac tctttctgaa 3000 
agagtaaaat tagcagaaca taaaatattt cctgcagccc ttcagctggt ggccagtgga 3060
actgtacagc ttggagaaaa tggcaagatc tgttgggtta aagaggaatg aagcctttta 3120
attcagaaat ggggccagtt tagaaagaat tatttgctgt ttgcatggtg gttttttatc 3180 



atggacttgg cccaaaagaa aaactgctaa aagacaaaaa agacctcacc cttacttcat 3240 
ctattttttt aataaataga gactcactaa aaaaaaaaaa aaaaaaaaaa a 3291



<210> 13 

<211> 1776 

<212> DNA 

<213> Homo sapiens 

<400> 13 

atggtgccct ccagcccagc ggtggagaag caggtgcccg tggaacctgg gcctgacccc 60 
gagctccggt cctggcggcg cctcgtgtgc tacctttgct tctacggctt catggcgcag 120
atacggccag gggagagctt catcaccccc tacctcctgg ggcccgacaa gaacttcacg 180
cgggacgagg tcacgaacga gatcacgccg gtgctgtcgt actcctacct ggccgtgctg 240 
gtgcccgtgt tcctgctcac cgactacctg cgctacacgc cggtgctgct gctgcagggg 300 
ctcagcttcg tgtcggtgtg gctgctgctg ctgctgggcc actcggtggc gcacatgcag 360 
ctcatggagc tcttctacag cgtcaccatg gccgcgcgca tcgcctattc ctcctacatc 420
ttctctctcg tgcggcccgc gcgctaccag cgtgtggccg gctactcgcg cgctgcggtg 480
ctgctgggcg tgttcaccag ctccgtgctg ggccagctgc tggtcactgt gggccgagtc 540 
tccttctcca cgctcaacta catctcgctg gccttcctca ccttcagcgt ggtcctcgcc 600 
ctcttcctga agcgccccaa gcgcagcctc ttcttcaacc gcgacgaccg ggggcggtgc 660 
gaaacctcgg cttcggagct ggagcgcatg aatcctggcc caggcgggaa gctgggacac 720
gccctgcggg tggcctgtgg ggactcagtg ctggcgcgga tgctgcggga gctgggggac 780
agcctgcggc ggccgcagct gcgcctgtgg tccctctggt gggtcttcaa ctcggccggc 840 
tactacctgg tggtctacta cgtgcacatc ctgtggaacg aggtggaccc caccaccaac 900
agtgcgcggg tctacaacgg cgcggcagat gctgcctcca cgctgctggg cgccatcacg 960
tccttcgccg cgggcttcgt gaagatccgc tgggcgcgct ggtccaagct gctcatcgcg 1020
ggcgtcacgg ccacgcaggc ggggctggtc ttccttctgg cgcacacgcg ccacccgagc 1080 
agcatctggc tgtgctatgc ggccttcgtg ctgttccgcg gctcctacca gttcctcgtg 1140 
cccatcgcca cctttcagat tgcatcttct ctgtctaaag agctctgtgc cctggtcttc 1200 
ggggtcaaca cgttctttgc caccatcgtc aagaccatca tcactttcat tgtctcggac 1260 
gtgcggggcc tgggcctccc ggtccgcaag cagttccagt tatactccgt gtacttcctg 1320
atcctgtcca tcatctactt cttgggggcc atgctggatg gcctgcgcga ctgccagcgg 1380
ggccaccacc cgcggcagcc cccggcccag ggcctgagga gtgccgcgga ggagaaggca 1440
gcacagcgac tgagcgtgca ggacaagggc ctcggaggcc tgcagccagc ccagagcccg 1500
ccgctttccc cagaagacag cctgggggct gtggggccag cctccctgga gcagagacag 1560
agcgacccat acctggccca ggccccggcc ccgcaggcag ctgaattcct gagcccagtg 1620 
acaacccctt ccccctgcac tctgtcgtcc gcccaagcct caggccctga ggctgcagat 1680 
gagacttgtc cccagctggc tgtccatcct cctggtgtca gcaagctggg tttgcagtgt 1740 
cttccaagcg acggtgttca gaatgtgaac cagtga 1776 

<210> 14 

<211> 2500 

<212> DNA 

<213> Homo sapiens 

<400> 14 

tgaatcgccc ggggtcgccg tctccgcctc gccgcagtcg gggcagccgc tgccctcttt 60 
tccatgtatc gtccaggatc ccatgacaga ttctgttgtc acgtctcctt acagagtttg 120
agcggtgctg aactgtcagc acatctgtcc ggtccagcat gccttctgag accccccagg 180
cagaagtggg gcccacaggc tgcccccacc gctcagggcc acactcggcg aaggggagcc 240 
tggagaaggg gtccccagag gataaggaag ccaaggagcc cctgtggatc cggcccgatg 300 
ctccgagcag gtgcacctgg cagctgggcc ggcctgcctc cgagtcccca catcaccaca 360
ctgccccggc aaaatctcca aaaatcttgc cagatattct gaagaaaatc ggggacaccc 420
ctatggtcag aatcaacaag attgggaaga agttcggcct gaagtgtgag ctcttggcca 480
agtgtgagtt cttcaacgcg ggcgggagcg tgaaggaccg catcagcctg cggatgattg 540 
aggatgctga gcgcgacggg acgctgaagc ccggggacac gattatcgag ccgacatccg 600 
ggaacaccgg gatcgggctg gccctggctg cggcagtgag gggctatcgc tgcatcatcg 660



tgatgccaga gaagatgagc tccgagaagg tggacgtgct gcgggcactg ggggctgaga 720
ttgtgaggac gcccaccaat gccaggttcg actccccgga gtcacacgtg ggggtggcct 780
ggcggctgaa gaacgaaatc cccaattctc acatcctaga ccagtaccgc aacgccagca 840 
accccctggc tcactacgac accaccgctg atgagatcct gcagcagtgt gatgggaagc 900
tggacatgct ggtggcttca gtgggcacgg gcggcaccat cacgggcatt gccaggaagc 960 
tgaaggagaa gtgtcctgga tgcaggatca ttggggtgga tcccgaaggg tccatcctcg 1020
cagagccgga ggagctgaac cagacggagc agacaaccta cgaggtggaa gggatcggct 1080
acgacttcat ccccacggtg ctggacagga cggtggtgga caagtggttc aagagcaacg 1140 
atgaggaggc gttcaccttt gcccgcatgc tgatcgcgca agaggggctg ctgtgcggtg 1200 
gcagtgctgg cagcacggtg gcggtggccg tgaaggctgc gcaggagctg caggagggcc 1260
agcgctgcgt ggtcattctg cccgactcag tgcggaacta catgaccaag ttcctgagcg 1320
acaggtggat gctgcagaag ggctttctga aggaggagga cctcacggag aagaagccct 1380
ggtggtggca cctccgtgtt caggagctgg gcctgtcagc cccgctgacc gtgctcccga 1440 
ccatcacctg tgggcacacc atcgagatcc tccgggagaa gggcttcgac caggcgcccg 1500
tggtggatga ggcgggggta atcctgggaa tggtgacgct tgggaacatg ctctcgtccc 1560 
tgcttgccgg gaaggtgcag ccgtcagacc aagttggcaa agtcatctac aagcagttca 1620
aacagatccg cctcacggac acgctgggca ggctctcgca catcctggag atggaccact 1680 
tcgccctggt ggtgcacgag cagatccagt accacagcac cgggaagtcc agtcagcggc 1740 
agatggtgtt cggggtggtc accgccattg acttgctgaa cttcgtggcc gcccaggagc 1800
gggaccagaa gtgaagtccg gagcgctggg cggtgcggag cgggcccgcc acccttgccc 1860 
acttctcctt cgctttcctg agccctaaac acacgcgtga ttggtaactg cctggcctgg 1920 
caccgttatc cctgcagacg gcacagagca tccgtctccc ctcgttaaca catggcttcc 1980 
taaatggccc tgtttacggc ctatgagatg aaatatgtga ttttctctaa tgtaacttcc 2040
tcttaggatg tttcaccaag gaaatattga gagagaagtc ggccaggtag gatgaacaca 2100 
ggcaatgact gcgcagagtg gattaaaggc aaaagagaga agagtccagg aaggggcggg 2160 
gagaagcctg ggtggctcag catcctccac gggctgcgcg tctgctcggg gctgagctgg 2220
cgggagcagt ttgcgtgttt gggtttttta attgagatga aattcaaata acctaaaaat 2280
caatcacttg aaagtgaaca atcagcggca tttagtacat ccagaaagtt gtgtaggcac 2340 
cacctctgtc acgttctgga acattctgtc atcaccccgt gaagcaatca tttcccctcc 2400 
cgtcttcctc ctcccctggc aactgctgat cgactttgtg tctctgttgt ctaaaatagg 2460 
ttttccctgt tctggacatt tcatataaat ggaatcacac 2500 

<210> 15 

<211> 2068 

<212> DNA 

<213> Homo sapiens 

<400> 15 

cggcagccct cctacctgcg cacgtggtgc cgctgctgct gcctcccgct cgccctgaac 60 
ccagtgcctg cagccatggc tcccggccag ctcgccttat ttagtgtctc tgacaaaacc 120
ggccttgtgg aatttgcaag aaacctgacc gctcttggtt tgaatctggt cgcttccgga 180 
gggactgcaa aagctctcag ggatgctggt ctggcagtca gagatgtctc tgagttgacg 240 
ggatttcctg aaatgttggg gggacgtgtg aaaactttgc atcctgcagt ccatgctgga 300 
atcctagctc gtaatattcc agaagataat gctgacatgg ccagacttga tttcaatctt 360 
ataagagttg ttgcctgcaa tctctatccc tttgtaaaga cagtggcttc tccaggtgta 420
actgttgagg aggctgtgga gcaaattgac attggtggag taaccttact gagagctgca 480
gccaaaaacc acgctcgagt gacagtggtg tgtgaaccag aggactatgt ggtggtgtcc 540 
acggagatgc agagctccga gagtaaggac acctccttgg agactagacg ccagttagcc 600
ttgaaggcat tcactcatac ggcacaatat gatgaagcaa tttcagatta tttcaggaaa 660
cagtacagca aaggcgtatc tcagatgccc ttgagatatg gaatgaaccc acatcagacc 720
cctgcccagc tgtacacact gcagcccaag cttcccatca cagttctaaa tggagcccct 780
ggatttataa acttgtgcga tgctttgaac gcctggcagc tggtgaagga actcaaggag 840 
gctttaggta ttccagccgc tgcctctttc aaacatgtca gcccagcagg tgctgctgtt 900
ggaattccac tcagtgaaga tgaggccaaa gtctgcatgg tttatgatct ctataaaacc 960
ctcacaccca tctcagcggc atatgcaaga gcaagagggg ctgataggat gtcttcattt 1020
ggtgattttg ttgcattgtc cgatgtttgt gatgtaccaa ctgcaaaaat tatttccaga 1080
gaagtatctg atggtataat tgccccagga tatgaagaag aagccttgac aatactttcc 1140 



aaaaagaaaa atggaaacta ttgtgtcctt cagatggacc aatcttacaa accagatgaa 1200
aatgaagttc gaactctctt tggtcttcat ttaagccaga agagaaataa tggtgtcgtc 1260
gacaagtcat tatttagcaa tgttgttacc aaaaataaag atttgccaga gtctgccctc 1320 
cgagacctca tcgtagccac cattgctgtc aagtacactc agtctaactc tgtgtgctac 1380 
gccaagaacg ggcaggttat cggcattgga gcaggacagc agtctcgtat acactgcact 1440 
cgccttgcag gagataaggc aaactattgg tggcttagac accatccaca agtgctttcg 1500 
atgaagttta aaacaggagt gaagagagca gaaatctcca atgccatcga tcaatatgtg 1560 
actggaacca ttggcgagga tgaagatttg ataaagtgga aggcactgtt tgaggaagtc 1620
cctgagttac tcactgaggc agagaagaag gaatgggttg agaaactgac tgaagtttct 1680 
atcagctctg atgccttctt ccctttccga gataacgtag acagagctaa aaggagtggt 1740 
gtggcgtaca ttgcggctcc ctccggttct gctgctgaca aagttgtgat tgaggcctgc 1800 
gacgaactgg gaatcatcct cgctcatacg aaccttcggc tcttccacca ctgattttac 1860 
cacacactgt tttttggctt gcttatgtgt aggtgaacag tcacgcctga aactttgagg 1920
ataacttttt aaaaaaataa aacagtatct cttaaaacaa tgttttgatc tacataaaca 1980 
ttgtaaaaat tttcaatcac gctttttaac tttcttacca caaaaaaatg ataagtgggt 2040 
gaagtgatgg ttatgttaat tagcgtgc 2068

<210> 16 

<211> 857 

<212> DNA 

<213> Homo sapiens 



<400> 16 

gcgtgggcgt gagatggcgg cggcagcggt gagcagcgcc aagcggagcc tgcggggaga 60
gctgaagcag cgtctgcggg cgatgagtgc cgaggagcgg ctacgccagt cccgcgtact 120
gagccagaag gtgattgccc acagtgagta tcaaaagtcc aaaagaattt ccatctttct 180
gagcatgcaa gatgaaattg agacagaaga gatcatcaag gacattttcc aacgaggcaa 240
aatctgcttc atccctcggt accggttcca gagcaatcac atggatatgg tgagaataga 300
atcaccagag gaaatttctt tacttcccaa aacatcctgg aatatccctc agcctggtga 360
gggtgatgtt cgggaggagg ccttgtccac agggggactt gatctcatct tcatgccagg 420
ccttgggttt gacaaacatg gcaaccgact ggggaggggc aagggctact atgatgccta 480
tctgaagcgc tgtttgcagc atcaggaagt gaagccctac accctggcgt tggctttcaa 540
agaacagatt tgcctccagg tcccagtgaa tgaaaacgac atgaaggtag atgaagtcct 600
ttacgaagac tcgtcaacag cttaaatctg gattactaca gccaaataat cagtgtttta 660
tatgagagta aagcaaagta tgtgtatttt tcccttgtca aaaattagtt gaaattgttc 720
attaatgtga atacagactg cattttaaaa ttgtaattat gaaatacctt atataaaacc 780
atctttaaaa accaatagaa gtgtgaatag tagaatatta attaaaatgg aggctatcag 840
cctgtgattt tcagctt 857



<210> 17 

<211> 3762 

<212> DNA 

<213> Homo sapiens 



<400> 17 

cccgcgagcg tccatccatc tgtccggccg actgtccagc gaaaggggct ccaggccggg 60
cgcacgtcga cccgggggac cgaggccagg agaggggcca agagcgcggc tgacccttgc 120
gggccggggc aggggacggt ggccgcggcc atgcagtcct gtgccagggc gtgggggctg 180
cgcctgggcc gcggggtcgg gggcggccgc cgcctggctg ggggatcggg gccgtgctgg 240
gcgccgcgga gccgggacag cagcagtggc ggcggggaca gcgccgcggc tggggcctcg 300
cgcctcctgg agcgccttct gcccagacac gacgacttcg ctcggaggca catcggccct 360
ggggacaaag accagagaga gatgctgcag accttggggc tggcgagcat tgatgaattg 420
atcgagaaga cggtccctgc caacatccgt ttgaaaagac ccttgaaaat ggaagaccct 480
gtttgtgaaa atgaaatcct tgcaactctg catgccattt caagcaaaaa ccagatctgg 540
agatcgtata ttggcatggg ctattataac tgctcagtgc cacagacgat tttgcggaac 600
ttactggaga actcaggatg gatcacccag tatactccat accagcctga ggtgtctcag 660
gggaggctgg agagtttact caactaccag accatggtgt gtgacatcac aggcctggac 720



atggccaatg catccctgct ggatgagggg actgcagccg cagaggcact gcagctgtgc 780 
tacagacaca acaagaggag gaaatttctc gttgatcccc gttgccaccc acagacaata 840 
gctgttgtcc agactcgagc caaatatact ggagtcctca ctgagctgaa gttaccctgt 900 
gaaatggact tcagtggaaa agatgtcagt ggagtgttgt tccagtaccc agacacggag 960 
gggaaggtgg aagactttac ggaactcgtg gagagagctc atcagagtgg gagcctggcc 1020 
tgctgtgcta ctgacctttt agctttgtgc atcttgaggc cacctggaga atttggggta 1080 
gacatcgccc tgggcagctc ccagagattt ggagtgccac tgggctatgg gggaccccat 1140 
gcagcatttt ttgctgtccg agaaagcttg gtgagaatga tgcctggaag aatggtgggg 1200 
gtaacaagag atgccactgg gaaagaagtg tatcgtcttg ctcttcaaac cagggagcaa 1260 
cacattcgga gagacaaggc taccagcaac atctgtacag ctcaggccct cttggcgaat 1320 
atggctgcca tgtttcgaat ctaccatggt tcccatgggc tggagcatat tgctaggagg 1380 
gtacataatg ccactttgat tttgtcagaa ggtctcaagc gagcagggca tcaactccag 1440 
catgacctgt tctttgatac cttgaagatt cattgtggct gctcagtgaa ggaggtcttg 1500 
ggcagggcgg ctcagcggca gatcaatttt cggctttttg aggatggcac acttggtatt 1560 
tctcttgatg aaacagtcaa tgaaaaagat ctggacgatt tgttgtggat ctttggttgt 1620 
gagtcatctg cagaactggt tgctgaaagc atgggagagg agtgcagagg tattccaggg 1680 
tctgtgttca agaggaccag cccgttcctc acccatcaag tgttcaacag ctaccactct 1740 
gaaacaaaca ttgtccggta catgaagaaa ctggaaaata aagacatttc ccttgttcac 1800 
agcatgattc cactgggatc ctgcaccatg aaactgaaca gttcgtctga actcgcacct 1860 
atcacatgga aagaatttgc aaacatccac ccctttgtgc ctctggatca agctcaagga 1920 
tatcagcagc ttttccgaga gcttgagaag gatttgtgtg aactcacagg ttatgaccag 1980 
gtctgtttcc agccaaacag cggagcccag ggagaatatg ctggactggc cactatccga 2040 
gcctacttaa accagaaagg agaggggcac agaacggttt gcctcattcc gaaatcagca 2100 
catgggacca acccagcaag tgcccacatg gcaggcatga agattcagcc tgtggaggtg 2160 
gataaatatg ggaatatcga tgcagttcac ctcaaggcca tggtggataa gcacaaggag 2220 
aacctagcag ctatcatgat tacataccca tccaccaatg gggtgtttga agagaacatc 2280 
agtgacgtgt gtgacctcat ccatcaacat ggaggacagg tctacctaga cggggcaaat 2340 
atgaatgctc aggtgggaat ctgtcgccct ggagacttcg ggtctgatgt ctcgcaccta 2400 
aatcttcaca agaccttctg cattccccac ggaggaggtg gtcctggcat ggggcccatc 2460 
ggagtgaaga aacatctcgc cccgtttttg cccaatcatc ccgtcatttc actaaagcgg 2520 
aatgaggatg cctgtcctgt gggaaccgtc agtgcggccc catggggctc cagttccatc 2580 
ttgcccattt cctgggctta tatcaagatg atgggaggca agggtcttaa acaagccacg 2640 
gaaactgcga tattaaatgc caactacatg gccaagcgat tagaaacaca ctacagaatt 2700 
cttttcaggg gtgcaagagg ttatgtgggt catgaattta ttttggacac gagacccttc 2760 
aaaaagtctg caaatattga ggctgtggat gtggccaaga gactccagga ttatggattt 2820 
cacgccccta ccatgtcctg gcctgtggca gggaccctca tggtggagcc cactgagtcg 2880 
gaggacaagg cagagctgga cagattctgt gatgccatga tcagcattcg gcaggaaatt 2940 
gctgacattg aggagggccg catcgacccc agggtcaatc cgctgaagat gtctccacac 3000 
tccctgacct gcgttacatc ttcccactgg gaccggcctt attccagaga ggtggcagca 3060 
ttcccactcc ccttcatgaa accagagaac aaattctggc caacgattgc ccggattgat 3120 
gacatatatg gagatcagca cctggtttgt acctgcccac ccatggaagt ttatgagtct 3180 
ccattttctg aacaaaagag ggcgtcttct tagtcctctc tccctaagtt taaaggactg 3240 
atttgatgcc tctccccaga gcatttgata agcaagaaag atttcatctc ccaccccagc 3300 
ctcaagtagg agttttatat actgtgtata tctctgtaat ctctgtcaag gtaaatgtaa 3360 
atacagtagc tggagggagt cgaagctgat ggttggaaga cggatttgct ttggtattct 3420 
gcttccacat gtgccagttg cctggattgg gagccatttt gtgttttgcg tagaaagttt 3480 
taggaacttt aacttttaat gtggcaagtt tgcagatgtc atagaggcta tcctggagac 3540 
ttaatagaca tttttttgtt ccaaaagagt ccatgtggac tgtgccatct gtgggaaatc 3600 
ccagggcaaa tgtttacatt ttgtataccc tgaagaactc tttttcctct aatatgccta 3660 
atctgtaatc acatttctga gtgttttcct ctttttctgt gtgaggtttt tttttttttt 3720 
aatctgcatt tattagtatt ctaataaaag cattttgatc gg 3762 

<210> 18 

<211> 1192 

<212> DNA 

<213> Homo sapiens 



<400> 18 

ggctccctcc ggccgcgaac tgcccctccc cgccccgcct cccggcgcgg gtggccgagg 60 
cgtagcgccg cgacccccgc acccctgcga acatggcgct gcgagtggtg cggagcgtgc 120 
gggccctgct ctgcaccctg cgcgcggtcc cgttacccgc cgcgccctgc ccgccgaggc 180 
cctggcagct gggggtgggc gccgtccgta cgctgcgcac tggacccgct ctgctctcgg 240 
tgcgtaaatt cacagagaaa cacgaatggg taacaacaga aaatggcatt ggaacagtgg 300 
gaatcagcaa ttttgcacag gaagcgttgg gagatgttgt ttattgtagt ctccctgaag 360 
ttgggacaaa attgaacaaa caagatgagt ttggtgcttt ggaaagtgtg aaagctgcta 420 
gtgaactata ttctccttta tcaggagaag taactgaaat taatgaagct cttgcagaaa 480 
atccaggact tgtaaacaaa tcttgttatg aagatggttg gctgatcaag atgacactga 540 
gtaacccttc agaactagat gaacttatga gtgaagaagc atatgagaaa tacataaaat 600 
ctattgagga gtgaaaatgg aactcctaaa taaactagta tgaaataacg aagccagcag 660 
agttgtctta aattagtggt ggatagagac ttagaataga aacttttagt attaccgatg 720 
gggcaaaaaa aaactactgt taacactgct aatgaaagaa aatgcccttt aactttgtaa 780 
tgattataga taaatataat atgcgtcttt ttcacaatat cctatgattt ttagactagg 840 
ctctagtgtt cagaattcat gaaattatcc atggtaaaaa ctagttataa aaattacata 900 
attcaaagat aacattgtta ttcttaagcc ttatataata ttgtaacttg catgtatcca 960 
tacctggatt tgggatgaaa tacttaatga tctttccatt ggaaataact ggaagtgaag 1020 
aggttttgtt gcttgtacag tgtcagatga ggaacaccac tatcttaatt ttgcgataca 1080 
ctgcatttgc tggtgctatt tttatacagt gaagcaacag ctttgcagca aaataataaa 1140 
atacttcttc gttaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 1192 



<210> 19 

<211> 2102 

<212> DNA 

<213> Homo sapiens 



<400> 19 

tgcccacgcc cccttcagat cctttgctcc ggagagagac ctgtccgagc agaggcctgg 60 
actacatctc ccggcgtgcc tggcagtgtg gtggcctctg tgcgccgtct gcactcgttg 120 
caggcgacga tgcagagggc tgtaagtgtg gtggcccgtc tgggctttcg cctgcaggca 180 
ttccccccgg ccttgtgtcg tccacttagt tgcgcacagg aggtgctccg caggacaccg 240 
ctctatgact tccacctggc ccacggcggg aaaatggtgg cgtttgcggg ttggagtctg 300 
ccagtgcagt accgggacag tcacactgac tcgcacctgc acacacgcca gcactgctcg 360 
ctctttgacg tgtctcatat gctgcagacc aagatacttg gtagtgaccg ggtgaagctg 420 
atggagagtc tagtggttgg agacattgca gagctaagac caaaccaggg gacactgtcg 480 
ctgtttacca acgaggctgg aggcatctta gatgacttga ttgtaaccaa tacttctgag 540 
ggccacctgt atgtggtgtc caacgctggc tgctgggaga aagatttggc cctcatgcag 600 
gacaaggtca gggagcttca gaaccagggc agagatgtgg gcctggaggt gttggataat 660 
gccctgctag ctctgcaagg ccccactgca gcccaggtac tacaggccgg cgtggcagat 720 
gacctgagga aactgccctt catgaccagt gctgtgatgg aggtgtttgg cgtgtctggc 780 
tgccgcgtga cccgctgtgg ctacacagga gaggatggtg tggagatctc ggtgccggta 840 
gcgggggcag ttcacctggc aacagctatt ctgaaaaacc cagaggtgaa gctggcaggg 900 
ctggcagcca gggacagcct gcgcctggag gcaggcctct gcctgtatgg gaatgacatt 960 
gatgaacaca ctacacctgt ggagggcagc ctcagttgga cactggggaa gcgccgccga 1020 
gctgctatgg acttccctgg agccaaggtc attgttcccc agctgaaggg cagggtgcag 1080 
cggaggcgtg tggggttgat gtgtgagggg gcccccatgc gggcacacag tcccatcctg 1140 
aacatggagg gtaccaagat tggtactgtg actagtggct gcccctcccc ctctctgaag 1200 
aagaatgtgg cgatgggtta tgtgccctgc gagtacagtc gtccagggac aatgctgctg 1260 
gtagaggtgc ggcggaagca gcagatggct gtagtcagca agatgccctt tgtgcccaca 1320 
aactactata ccctcaagtg aagctggctc agggtggggc tgtcccttcc aggagttttg 1380 
cccctacaag gggttagtca agaagctgag gcagaactca ctgggggtgg gcagttaagg 1440 
tggaggctga ttctaattgt ctggttgagg ggccacacca cctattcccc ccacctaact 1500 
catgccattc cagcttcctt caggaccctg cttctgagtg acggaccagc tcacacaatg 1560 
tcttgtttca gtccatgatc ccactgacct actcttgcct gctggagggt aatgagaagc 1620 
tttggttctg ccatctctcc cactctgcca ggtgctggct gtggagcaaa ggctcacctt 1680 
tgtggagagg ataaaacctg cccaacctac ctcaccatgg tttttcacat tgcaaagggt 1740 



aataacatgg gcagtgcgga cttaggctac cccctccagt ttgctttccg taaatgcaaa 1800 
ttgtccttac tgcaagtcag gaatgattgc tgactcacag tagggctgct atgcctgtgt 1860 
gtaaacttgg ggatggctga gggaacatag actcactctt ccacattccc aagttggtct 1920 
agtgtgctgc ccagtagcaa accatggcag actcaccacc tattctgagt tccagggctg 1980 
ctgtagggca gggtgggctt cctcccagac ttgccttacc ctgggctgat ctttgcccct 2040 
ggtatgcatt aatggactcc actgaatcct gaaaaaaaaa ttaaacttcc ttcttacttg 2100 
cc 2102 

<210> 20 

<211> 3228 

<212> DNA 

<213> Homo sapiens 

<400> 20 

aaaaaactca ggcaaagtca cagcctcaaa attgttcact gaaagaacgc tgagtggaga 60 
agtgtgagaa gatgaatgga ccggtggatg gcttgtgtga ccactctcta agtgaaggag 120 
tcttcatgtt cacatcggag tctgtgggag agggacaccc ggataagatc tgtgaccaga 180 
tcagtgatgc agtgctggat gcccatctca agcaagaccc caatgccaag gtggcctgtg 240 
agacagtgtg caagaccggc atggtgctgc tgtgtggtga gatcacctca atggccatgg 300 
tggactacca gcgggtggtg agggacacca tcaagcacat cggctacgat gactcagcca 360 
agggctttga cttcaagact tgcaacgtgc tggtggcttt ggagcagcaa tccccagata 420 
ttgcccagtg cgtccatctg gacagaaatg aggaggatgt gggggcagga gatcagggtt 480 
tgatgttcgg ctatgctacc gacgagacag aggagtgcat gcccctcacc atcatccttg 540 
ctcacaagct caacgcccgg atggcagacc tcaggcgctc cggcctcctc ccctggctgc 600 
ggcctgactc taagactcag gtgacagttc agtacatgca ggacaatggc gcagtcatcc 660 
ctgtgcgcat ccacaccatc gtcatctctg tgcagcacaa cgaagacatc acgctggagg 720 
agatgcgcag ggccctgaag gagcaagtca tcagggccgt ggtgccggcc aagtacctgg 780 
acgaagacac cgtctaccac ctgcagccca gtgggcggtt tgtcatcgga ggtccccagg 840 
gggatgcggg tgtcactggc cgtaagatta ttgtggacac ctatggcggc tggggggctc 900 
atggtggtgg ggccttctct gggaaggact acaccaaggt agaccgctca gctgcatatg 960 
ctgcccgctg ggtggccaag tctctggtga aagcagggct ctgccggaga gtgcttgtcc 1020 
aggtttccta tgccattggt gtggccgagc cgctgtccat ttccatcttc acctacggaa 1080 
cctctcagaa gacagagcga gagctgctgg atgtggtgca taagaacttc gacctccggc 1140 
cgggcgtcat tgtcagggat ttggacttga agaagcccat ctaccagaag acagcatgct 1200 
acggccattt cggaagaagc gagttcccat gggaggttcc caggaagctt gtattttaga 1260 
gccaggggga gctgggcctg gtctcaccct ggaggcacct ggtggccatg ctcctcttcc 1320 
ccagacgcct ggctgctgat cgccttcccc acccaccaac cctcagggca aagccaggtc 1380 
cctctcattt agcctgtcct gtcatcatca tggccagctg gaggcagggg cttcctggtg 1440 
ctggaggttg gatcttgatg taaggatggg catggtgttc tcctgctgct ccctcagact 1500 
ggggcaatgt taatttagtg gaaaaggcac ccccgtcaag agtgaattcc ctcactcgtc 1560 
tcccccaaca gctggaccct gaccagctcc ccctccctcc ccttgcctgt gccaggtgag 1620 
gtcagcacat ctcaacaggc ctcagggctc cttgtgggcc tgggctcctg gacccccctt 1680 
tcacaggcag ccagtgccct gagccagggt ctccagaaag ccccacccag gccaggcatg 1740 
tggcaggggt tagagcagga ctgatgtctc ctaagcacct gtaatgtgcg agggacccag 1800 
ctaataactg atctcgtttt ttcttcactg caacatgatg aggtagtacc ttttatatcc 1860 
catttataga tgggggaaag caaagcacag agagtctgga taacttccac agggtcccac 1920 
agccacgtgt ttagacctag atgtataact aggagctttg actcaggagc ctgtgacata 1980 
cccccttccc caccgttgtc tcatgccagt aacaggctca aacaatgaca aagcagattc 2040 
agaaatgagg ccatggactc tgtcctgaag gcctgaggtt actggaaatt aggggattaa 2100 
cccactagct cttgttgagc cgtgggcaat tgtctgaaaa gtgaagacag aaccacaggg 2160 
ctattttgtt tgcttcatgt gtcccagaag atgactgagg gtgagttggc ttacctggcc 2220 
catcagggta ggctggagtt agggactgac cagcagcttt agaatcccag ccccctgacc 2280 
actcagagac atgcagagat tgggtttttg gacttctggg gtaagtggtc taagtccagt 2340 
ccagtcctat gtgggcttcc tggagcagaa gcagcaactt gtcctagcac agatggccag 2400 
ccccttagac agaggccctc aagtctttct ctttccctgg tcccttgtat cccctgcagg 2460 
ctgagtgcat ttggagggag tgagtggccc tttcggatcc agggaggctg gtcctatggc 2520 
ctcatgttaa ataggcgggg cttgccttct ggtgttggac aagcttctga gacgtcatga 2580 



ggagattctg cctttgccag gtgactgtct ggggagcggg tctgctccca aggggcctga 2640 
gcagtccttg gcctgctaag gtcttggaac ttgcctgcct ttccatccat ggccagcagc 2700 
acctgcccta cctgccccac ttgtccttag cctggacctc tgacagcagc atctctacct 2760 
tctccccagc tcccaggacc acaggctcag gcagggcctc catgggcccc aggggaacac 2820 
tggggacttg gcctctctct agggtacatg gtgctgggag aggcagccca ggaagtctca 2880 
tctggggagc aggcagccag catctgggcc ttggcctgga gcacaaagac cctggctttc 2940 
attttctctc aggtgaaagg aaattaaggc aacaaaagaa gcccggctcc tggtcaccta 3000 
ggaagcctca gattccttcc catggaggga gggagtggtt tgcaggtggc caagttcctc 3060 
taacttggct cacactcgac atgaaaattc agaattttat actttcccta ccctctagag 3120 
aaataagatc ttttttgtca gtttgtttgt atgaaactaa agctttattt gttaatagtt 3180 
cctgctaaaa caatgaataa aaactcaagg agcaactaaa aaaaaaaa 3228 

<210> 21 
<211> 344 
<212> PRT 

<213> Homo sapiens 
<400> 21 

Met Ser Ala Leu Ala Ala Arg Leu Leu Gln Pro Ala His Ser Cys Ser 
1 5 10 15 

Leu Arg Leu Arg Pro Phe His Leu Ala Ala Val Arg Asn Glu Ala Val 
20 25 30 

Val Ile Ser Gly Arg Lys Leu Ala Gln Gln Ile Lys Gln Glu Val Arg 
35 40 45 

Gln Glu Val Glu Glu Trp Val Ala Ser Gly Asn Lys Arg Pro His Leu 
50 55 60 

Ser Val Ile Leu Val Gly Glu Asn Pro Ala Ser His Ser Tyr Val Leu 
65 70 75 80 

Asn Lys Thr Arg Ala Ala Ala Val Val Gly Ile Asn Ser Glu Thr Ile 
85 90 95 

Met Lys Pro Ala Ser Ile Ser Glu Glu Glu Leu Leu Asn Leu Ile Asn 
100 105 110 

Lys Leu Asn Asn Asp Asp Asn Val Asp Gly Leu Leu Val Gln Leu Pro 
115 120 125 

Leu Pro Glu His Ile Asp Glu Arg Arg Ile Cys Asn Ala Val Ser Pro 
130 135 140 

Asp Lys Asp Val Asp Gly Phe His Val Ile Asn Val Gly Arg Met Cys 
145 150 155 160 

Leu Asp Gln Tyr Ser Met Leu Pro Ala Thr Pro Trp Gly Val Trp Glu 
165 170 175 

Ile Ile Lys Arg Thr Gly Ile Pro Thr Leu Gly Lys Asn Val Val Val 
180 185 190 

Ala Gly Arg Ser Lys Asn Val Gly Met Pro Ile Ala Met Leu Leu His 
195 200 205 

Thr Asp Gly Ala His Glu Arg Pro Gly Gly Asp Ala Thr Val Thr Ile 
210 215 220 

Ser His Arg Tyr Thr Pro Lys Glu Gln Leu Lys Lys His Thr Ile Leu 
225 230 235 240 

Ala Asp Ile Val Ile Ser Ala Ala Gly Ile Pro Asn Leu Ile Thr Ala 
245 250 255 

Asp Met Ile Lys Glu Gly Ala Ala Val Ile Asp Val Gly Ile Asn Arg 
260 265 270 

Val His Asp Pro Val Thr Ala Lys Pro Lys Leu Val Gly Asp Val Asp 
275 280 285 

Phe Glu Gly Val Arg Gln Lys Ala Gly Tyr Ile Thr Pro Val Pro Gly 
290 295 300 

Gly Val Gly Pro Met Thr Val Ala Met Leu Met Lys Asn Thr Ile Ile 
305 310 315 320 

Ala Ala Lys Lys Val Leu Arg Leu Glu Glu Arg Glu Val Leu Lys Ser 
325 330 335 

Lys Glu Leu Gly Val Ala Thr Asn 
340 



<210> 22 

<211> 1283 

<212> DNA 

<213> Homo sapiens 

<400> 22 

tttcgcagcc gctgccgcct cgccgctgct ccttcgtaag gccacttccg cacaccgaca 60 
ccaacatgaa cggacagctc aacggcttcc acgaggcgtt catcgaggag ggcacattcc 120 
ttttcacctc agagtcggtc ggggaaggcc acccagataa gatttgtgac caaatcagtg 180 
atgctgtcct tgatgcccac cttcagcagg atcctgatgc caaagtagct tgtgaaactg 240 
ttgctaaaac tggaatgatc cttcttgctg gggaaattac atccagagct gctgttgact 300 
accagaaagt ggttcgtgaa gctgttaaac acattggata tgatgattct tccaaaggtt 360 
ttgactacaa gacttgtaac gtgctggtag ccttggagca acagtcacca gatattgctc 420 
aaggtgttca tcttgacaga aatgaagaag acattggtgc tggagaccag ggcttaatgt 480 
ttggctatgc cactgatgaa actgaggagt gtatgccttt aaccattgtc ttggcacaca 540 
agctaaatgc caaactggca gaactacgcc gtaatggcac tttgccttgg ttacgccctg 600 
attctaaaac tcaagttact gtgcagtata tgcaggatcg aggtgctgtg cttcccatca 660 
gagtccacac aattgttata tctgttcagc atgatgaaga ggtttgtctt gatgaaatga 720 
gggatgccct aaaggagaaa gtcatcaaag cagttgtgcc tgcgaaatac cttgatgagg 780 
atacaatcta ccacctacag ccaagtggca gatttgttat tggtgggcct cagggtgatg 840 
ctggtttgac tggacggaaa atcattgtgg acacttatgg cggttggggt gctcatggag 900 
gaggtgcctt ttcaggaaag gattatacca aggtcgaccg ttcagctgct tatgctgctc 960 
gttgggtggc aaaatccctt gttaaaggag gtctgtgccg gagggttctt gttcaggtct 1020 
cttatgctat tggagtttct catccattat ctatctccat tttccattat ggtacctctc 1080 
agaagagtga gagagagcta ttagagattg tgaagaagaa tttcgatctc cgccctgggg 1140 
tcattgtcag ggatctggat ctgaagaagc caatttatca gaggactgca gcctatggcc 1200 
actttggtag ggacagcttc ccatgggaag tgcccaaaaa gcttaaatat tgaaagtgtt 1260 
agcctttttt ccccagactt gtt 1283 



<210> 23 

<211> 3259 

<212> DNA 

<213> Homo sapiens 

<400> 23 

caaggttggt ggaagtcgcg ttgtgcaggt tcgtgcccgg ctggcgcggc gtggtttcac 60 
tgttacatgc cttgaagtga tgaggaggtt tctgttacta tatgctacac agcagggaca 120 
ggcaaaggcc atcgcagaag aaatgtgtga gcaagctgtg gtacatggat tttctgcaga 180 
tcttcactgt attagtgaat ccgataagta tgacctaaaa accgaaacag ctcctcttgt 240 
tgttgtggtt tctaccacgg gcaccggaga cccacccgac acagcccgca agtttgttaa 300 
ggaaatacag aaccaaacac tgccggttga tttctttgct cacctgcggt atgggttact 360 
gggtctcggt gattcagaat acacctactt ttgcaatggg gggaagataa ttgataaacg 420 
acttcaagag cttggagccc ggcatttcta tgacactgga catgcagatg actgtgtagg 480 
tttagaactt gtggttgagc cgtggattgc tggactctgg ccagccctca gaaagcattt 540 
taggtcaagc agaggacaag aggagataag tggcgcactc ccggtggcat cacctgcatc 600 
cttgaggaca gaccttgtga agtcagagct gctacacatt gaatctcaag tcgagcttct 660 
gagattcgat gattcaggaa gaaaggattc tgaggttttg aagcaaaatg cagtgaacag 720 
caaccaatcc aatgttgtaa ttgaagactt tgagtcctca cttacccgtt cggtaccccc 780 
actctcacaa gcctctctga atattcctgg tttaccccca gaatatttac aggtacatct 840 
gcaggagtct cttggccagg aggaaagcca agtatctgtg acttcagcag atccagtttt 900 
tcaagtgcca atttcaaagg cagttcaact tactacgaat gatgccataa aaaccactct 960 
gctggtagaa ttggacattt caaatacaga cttttcctat cagcctggag atgccttcag 1020 
cgtgatctgc cctaacagtg attctgaggt acaaagccta ctccaaagac tgcagcttga 1080 
agataaaaga gagcactgcg tccttttgaa aataaaggca gacacaaaga agaaaggagc 1140 
taccttaccc cagcatatac ctgcgggatg ttctctccag ttcattttta cctggtgtct 1200 
tgaaatccga gcaattccta aaaaggcatt tttgcgagcc cttgtggact ataccagtga 1260 
cagtgctgaa aagcgcaggc tacaggagct gtgcagtaaa caaggggcag ccgattatag 1320 
ccgctttgta cgagatgcct gtgcctgctt gttggatctc ctcctcgctt tcccttcttg 1380 
ccagccacca ctcagtctcc tgctcgaaca tcttcctaaa cttcaaccca gaccatattc 1440 
gtgtgcaagc tcaagtttat ttcacccagg aaagctccat tttgtcttca acattgtgga 1500 
atttctgtct actgccacaa cagaggttct gcggaaggga gtatgtacag gctggctggc 1560 
cttgttggtt gcttcagttc ttcagccaaa catacatgca tcccatgaag acagcgggaa 1620 
agccctggct cctaagatat ccatctctcc tcgaacaaca aattctttcc acttaccaga 1680 
tgacccctca atccccatca taatggtggg tccaggaacc ggcatagccc cgtttattgg 1740 
gttcctacaa catagagaga aactccaaga acaacaccca gatggaaatt ttggagcaat 1800 
gtggttgttt tttggctgca ggcataagga tagggattat ctattcagaa aagagctcag 1860 
acatttcctt aagcatggga tcttaactca tctaaaggtt tccttctcaa gagatgctcc 1920 
tgttggggag gaggaagccc cagcaaagta tgtacaagac aacatccagc ttcatggcca 1980 
gcaggtggcg agaatcctcc tccaggagaa cggccatatt tatgtgtgtg gagatgcaaa 2040 
gaatatggcc aaggatgtac atgatgccct tgtgcaaata ataagcaaag aggttggagt 2100 
tgaaaaacta gaagcaatga aaaccctggc cactttaaaa gaagaaaaac gctaccttca 2160 
ggatatttgg tcataaaacc agaaattaaa gaaagaggat taagcttttt tgactgaaag 2220 
tactaaaagt cagctttact agtgccaaac ctttaaattt tcaaaagaaa attttctttc 2280 
aacatttctt gaaggacatg gagtggagat tggatcattt aacaatataa caaaacttcc 2340 
tgatttgatt ttacgtatct tctatctacg cccttcctgt gcctgtgact ctccccaaat 2400 
tgccctgttg ccttgagctc ttctgagcta aaggcagcct tcagtcccta tcagcgcctc 2460 
ctttacttcc cagagaactt cacagagact ctgtccttcc atgcaaaggc ttcctgaaat 2520 
aggggagact gactgagtag ctcattcttg tgacttacag tgccaacatt taaaaaagta 2580 
tgaaaatgat ttatttttat atgatgtata cccataaaga atgctcatat taatgtactt 2640 
aaattacaca tgtagagcat atctgttata tgtttatgta actatcaaat ggttatttgt 2700 
tactaaagct atatttctga taaaaaatat tttaggataa ttgcctacag agggatttat 2760 
ttttatgatg ctgggaaata tgaaatgtat tttaaaattt cactctgggc atatggattt 2820 
atctatcacc attacttttt tttaagtcac aatttcagaa ttttgggaca tttgcattca 2880 
atttacaggt accagtacgt acatatttta atagaaagat acaacctttt tattttcact 2940 
ccttttattt ctgctgcttg gcacattttt gagttttccc acattatttg tctccatgat 3000 
accactcaag cagtgtgctg gacctaaaat actgacttta gttagtatcc ttggattttt 3060 



agattcccca gtgtctaatt ccctgttata atttgcacaa acaaaacaaa atgttatgat 3120 
aatctttctc cactgttcta atatatattg tatttttatt tgatagcttg ggatttaaaa 3180 
catctctgtt gaaggctttt gatccttttg agaaataaag atctgaaaga aatggcataa 3240 
tcttaaaaaa aaaaaaaaa 3259 

<210> 24 

<211> 1805 

<212> DNA 

<213> Homo sapiens 

<400> 24 

aagagactga actgtatctg cctctatttc caaaagactc acgttcaact ttcgctcaca 60 
caaagccggg aaaattttat tagtcctttt tttaaaaaaa gttaatataa aattatagca 120 
aaaaaaaaaa ggaacctgaa ctttagtaac acagctggaa caatcgcagc ggcggcggca 180 
gcggcgggag aagaggttta atttagttga ttttctgtgg ttgttggttg ttcgctagtc 240 
tcacggtgat ggaagctgca cattttttcg aagggaccga gaagctgctg gaggtttggt 300 
tctcccggca gcagcccgac gcaaaccaag gatctgggga tcttcgcact atcccaagat 360 
ctgagtggga catacttttg aaggatgtgc aatgttcaat cataagtgtg acaaaaactg 420 
acaagcagga agcttatgta ctcagtgaga gtagcatgtt tgtctccaag agacgtttca 480 
ttttgaagac atgtggtacc accctcttgc tgaaagcact ggttcccctg ttgaagcttg 540 
ctagggatta cagtgggttt gactcaattc aaagcttctt ttattctcgt aagaatttca 600 
tgaagccttc tcaccaaggg tacccacacc ggaatttcca ggaagaaata gagtttctta 660 
atgcaatttt cccaaatgga gcaggatatt gtatgggacg tatgaattct gactgttggt 720 
acttatatac tctggatttc ccagagagtc gggtaatcag tcagccagat caaaccttgg 780 
aaattctgat gagtgagctt gacccagcag ttatggacca gttctacatg aaagatggtg 840 
ttactgcaaa ggatgtcact cgtgagagtg gaattcgtga cctgatacca ggttctgtca 900 
ttgatgccac aatgttcaat ccttgtgggt attcgatgaa tggaatgaaa tcggatggaa 960 
cttattggac tattcacatc actccagaac cagaattttc ttatgttagc tttgaaacaa 1020 
acttaagtca gacctcctat gatgacctga tcaggaaagt tgtagaagtc ttcaagccag 1080 
gaaaatttgt gaccaccttg tttgttaatc agagttctaa atgtcgcaca gtgcttgctt 1140 
cgccccagaa gattgaaggt tttaagcgtc ttgattgcca gagtgctatg ttcaatgatt 1200 
acaattttgt ttttaccagt tttgctaaga agcagcaaca acagcagagt tgattaagaa 1260 
aaatgaagaa aaaacgcaaa aagagaacac atgtagaagg tggtggatgc tttctagatg 1320 
tcgatgctgg gggcagtgct ttccataacc accactgtgt agttgcagaa agccctagat 1380 
gtaatgatag tgtaatcatt ttgaattgta tgcattatta tatcaaggag ttagatatct 1440 
tgcatgaatg ctctcttctg tgtttaggta ttctctgcca ctcttgctgt gaaattgaag 1500 
tggatgtaga aaaaaccttt tactatatga aactttacaa cacttgtgaa agcaactcaa 1560 
tttggtttat gcacagtgta atatttctcc aagtatcatc caaaattccc cacagacaag 1620 
gctttcgtcc tcattaggtg ttggcctcag cctaaccctc taggactgtt ctattaaatt 1680 
gctgccagaa ttttacatcc agttacctcc actttctaga acatattctt tactaatgtt 1740 
attgaaacca atttctactt catactgatg tttttggaaa cagcaattaa agtttttctt 1800 
ccatg 1805 

<210> 25 
<211> 254 
<212> PRT 

<213> Homo sapiens 
<400> 25 

Gln Asp Ile Leu Val Phe Arg Ser Lys Thr Tyr Gly Asn Val Leu Val 
1 5 10 15 

Leu Asp Gly Val Ile Gln Cys Thr Glu Arg Asp Glu Phe Ser Tyr Gln 
20 25 30 

Glu Met Ile Ala Asn Leu Pro Leu Cys Ser His Pro Asn Pro Arg Lys 
35 40 45 

Val Leu Ile Ile Gly Gly Gly Asp Gly Gly Val Leu Arg Glu Val Val 
50 55 60 

Lys His Pro Ser Val Glu Ser Val Val Gln Cys Glu Ile Asp Glu Asp 
65 70 75 80 

Val Ile Gln Val Ser Lys Lys Phe Leu Pro Gly Met Ala Ile Gly Tyr 
85 90 95 

Ser Ser Ser Lys Leu Thr Leu His Val Gly Asp Gly Phe Glu Phe Met 
100 105 110 

Lys Gln Asn Gln Asp Ala Phe Asp Val Ile Ile Thr Asp Ser Ser Asp 
115 120 125 

Pro Met Gly Pro Ala Glu Ser Leu Phe Lys Glu Ser Tyr Tyr Gln Leu 
130 135 140 

Met Lys Thr Ala Leu Lys Glu Asp Gly Val Leu Cys Cys Gln Gly Glu 
145 150 155 160 

Cys Gln Trp Leu His Leu Asp Leu Ile Lys Glu Met Arg Gln Phe Cys 
165 170 175 

Gln Ser Leu Phe Pro Val Val Ala Tyr Ala Tyr Cys Thr Ile Pro Thr 
180 185 190 

Tyr Pro Ser Gly Gln Ile Gly Phe Met Leu Cys Ser Lys Asn Pro Ser 
195 200 205 

Thr Asn Phe Gln Glu Pro Val Gln Pro Leu Thr Gln Gln Gln Val Ala 
210 215 220 

Gln Met Gln Leu Lys Tyr Tyr Asn Ser Asp Val His Arg Ala Ala Phe 
225 230 235 240 

Val Leu Pro Glu Phe Ala Arg Lys Ala Leu Asn Asp Val Ser 
245 250 



<210> 26 

<211> 2211 

<212> DNA 

<213> Homo sapiens 

<400> 26 

ctgaggccca gcccccttcg cccgtttcca tcacgagtgc cgccagcatg tctgacaaac 60 

tgccctacaa agtcgccgac atcggcctgg ctgcctgggg acgcaaggcc ctggacattg 120 

ctgagaacga gatgccgggc ctgatgcgta tgcgggagcg gtactcggcc tccaagccac 180 

tgaagggcgc ccgcatcgct ggctgcctgc acatgaccgt ggagacggcc gtcctcattg 240 

agaccctcgt caccctgggt gctgaggtgc agtggtccag ctgcaacatc ttctccaccc 300 

agaaccatgc ggcggctgcc attgccaagg ctggcattcc ggtgtatgcc tggaagggcg 360 

aaacggacga ggagtacctg tggtgcattg agcagaccct gtacttcaag gacgggcccc 420 

tcaacatgat tctggacgac gggggcgacc tcaccaacct catccacacc aagtacccgc 480 

agcttctgcc aggcatccga ggcatctctg aggagaccac gactggggtc cacaacctct 540 

acaagatgat ggccaatggg atcctcaagg tgcctgccat caatgtcaat gactccgtca 600 



ccaagagcaa gtttgacaac ctctatggct gccgggagtc cctcatagat ggcatcaagc 660 

gggccacaga tgtgatgatt gccggcaagg tagcggtggt agcaggctat ggtgatgtgg 720 

gcaagggctg tgcccaggcc ctgcggggtt tcggagcccg cgtcatcatc accgagattg 780 

accccatcaa cgcactgcag gctgccatgg agggctatga ggtgaccacc atggatgagg 840 

cctgtcagga gggcaacatc tttgtcacca ccacaggctg tattgacatc atccttggcc 900 

ggtaggtgcc agatgggggg tcccggggag tgagggagga gggcagagtt gggacagctt 960 

tctgtccctg acaatctccc acggtcttgg gctgcctgac aggcactttg agcagatgaa 1020 

ggatgatgcc attgtgtgta acattggaca ctttgacgtg gagatcgatg tcaagtggct 1080 

caacgagaac gccgtggaga aggtgaacat caagccgcag gtggaccggt atcggttgaa 1140 

gaatgggcgc cgcatcatcc tgctggccga gggtcggctg gtcaacctgg gttgtgccat 1200 

gggccacccc agcttcgtga tgagtaactc cttcaccaac caggtgatgg cgcagatcga 1260 

gctgtggacc catccagaca agtaccccgt tggggttcat ttcctgccca agaagctgga 1320 

tgaggcagtg gctgaagccc acctgggcaa gctgaatgtg aagttgacca agctaactga 1380 

gaagcaagcc cagtacctgg gcatgtcctg tgatggcccc ttcaagccgg atcactaccg 1440 

ctactgagag ccaggtctgc gtttcaccct ccagctgctg tccttgccca ggccccacct 1500 

ctcctcccta agagctaatg gcaccaactt tgtgattggt ttgtcagtgt cccccatcga 1560 

ctctctgggg ctgatcactt agtttttggc ctctgctgca gccgtcatac tgttccaaat 1620 

gtggcagcgg gaacagagta ccctcttcaa gccccggtca tgatggaggt cccagccaca 1680 

gggaaccatg agctcagtgg tcttggaaca gctcactaag tcagtccttc cttagcctgg 1740 

aagtcagtag tggagtcaca aagcccatgt gttttgccat ctaggccttc acctggtctg 1800 

tggacttata cctgtgtgct tggtttacag gtccagtggt tcttcagccc atgacagatg 1860 

agaaggggct atattgaagg gcaaagagga actgttgttt gaattttcct gagagcctgg 1920 

cttagtgctg ggccttctct taaacctcat tacaatgagg ttagtacttt tagtccctgt 1980 

tttacagggg ttagaataga ctgttaaggg gcaactgaga aagaacagag aagtgacagc 2040 

taggggttga gaggggccag aaaaacatga atgcaggcag atttcgtgaa atctgccacc 2100 

actttataac cagatggttc ctttcacaac cctgggtcaa aaagagaata atttggccta 2160 

taatgttaaa agaaagcagg aaggtgggta aataaaaatc ttggtgcctg g 2211 

<210> 27 

<211> 2436 

<212> DNA 

<213> Homo sapiens 

<400> 27 

cgaccacctg tctggacacc acaaagatgc cacccgttgg gggcaaaaag gccaagaagg 60 
gcatcctaga acgtttaaat gctggagaga ttgtgattgg agatggaggg tttgtctttg 120 
cactggagaa gaggggctac gtaaaggcag gaccctggac tcctgaagct gctgtggagc 180 
acccagaagc agttcgccag cttcatcgag agttcctcag agctggctca aacgtcatgc 240 
agaccttcac cttctatgcg agtgaagaca agctggagaa caggggcaac tatgtcttag 300 
agaagatatc tgggcaggaa gtcaatgaag ctgcttgcga catcgcccga caagtggctg 360 
atgaaggaga tgctttggta gcaggaggag tgagtcagac accttcatac cttagctgca 420 
agagtgaaac tgaagtcaaa aaagtatttc tgcaacagtt agaggtcttt atgaagaaga 480 
acgtggactt cttgattgca gagtattttg aacacgttga agaagctgtg tgggcagttg 540 
aaaccttgat agcatccggt aaacctgtgg cagcaaccat gtgcattggc ccagaaggag 600 
atttgcatgg cgtgcccccc ggcgagtgtg cagtgcgcct ggtgaaagca ggagcatcca 660 
tcattggtgt gaactgccac tttgacccca ccattagttt aaaaacagtg aagctcatga 720 
aggagggctt ggaggctgcc caactgaaag ctcacctgat gagccagccc ttggcttacc 780 
acactcctga ctgcaacaag cagggattca tcgatctccc agaattccca tttggactgg 840 
aacccagagt tgccaccaga tgggatattc aaaaatacgc cagagaggcc tacaacctgg 900 
gggtcaggta cattggcggg tgctgtggat ttgagcccta ccacatcagg gcaattgcag 960 
aggagctggc cccagaaagg ggctttttgc caccagcttc agaaaaacat ggcagctggg 1020 
gaagtggttt ggacatgcac accaaaccct gggttagagc aagggccagg aaggaatact 1080 
gggagaatct tcggatagcc tcaggccggc catacaaccc ttcaatgtca aagccagatg 1140 
gctggggagt gaccaaagga acagccgagc tgatgcagca gaaagaagcc acaactgagc 1200 
agcagctgaa agagctcttt gaaaaacaaa aattcaaatc acagtagcct cgatagaagc 1260 
tatttttgat gaatttctag gtgtttgggt cacagttcct acaaatacgg aaaagggggt 1320 
taaaaagcag tgctttcatg aatgccatcc tacacatatt attgctatta cctgaacaaa 1380 



atagaattac aaatagcact tgataatttt aaagtatgtt ttagaaattt tcttaggagc 1440 
aaaataagta caaagtaaat cttgaacagg ttcactaagc acccaccctg tgaaaagtat 1500 
tatggaaatc actgcagcac aggaaaagta attcagatgt taatgccact tgaagaagtt 1560 
ggtaggctag caaagaggat gagacatgaa ctgtcataaa ggactcagca accagccagg 1620 
gacagataaa gcgctatgga aaggggcttc caagttcttt tgaacatgac ccttagtaac 1680 
aaacacaatt tatataatga cccagcaaaa cacatcacat cttactgtcg aaattaaatg 1740 
tgtgatccat cctagtattt tctgttccat tccttttcat tctatttcat ttataaaaca 1800 
tgctagttga gacttttcaa atggattttt atgacccact actgggtttg gatccacagt 1860 
ttgaaaaata ttgctacaag acacttaagg agaccatcct gtttaagttt attcttataa 1920 
gtaggtcagt catatgagac ctgatcaata aatatccaat acccagagtc ctgctctcag 1980 
agttcttctg tttcgtgacc cacttttcta ccagtaaaag acatagacca atggggagga 2040 
ggggaggaga gatggatatt tcagccctct ccatcctagt caacactgga tccacctagt 2100 
gcctctgggc cataaggctg agcagagtga gcttgtatta gttggtagct tttaaaaaat 2160 
ataataaaaa aaaagtagag attctccaaa ctctagcctg gtttcctaga ttgagaacta 2220 
tgatattttt ctctgataat ttaatatcta ctctcctaca aaagctcaag cctgaagata 2280 
caagactatt agaagaaaca tgactaccct cagtgtatta gaaaagaggt catgcagctt 2340 
tctaaacatt attgaattgt ttgagctgtt ttgaaattgt aattcttttc agctattaaa 2400 
aagaagagca atgagaaaaa aaaaaaaaaa aaaaaa 2436 

<210> 28 

<211> 1326 

<212> DNA 

<213> Homo sapiens 

<400> 28 

ttcttttcct ctcttcttct ttcgcggttc agcatgcagg aaaaagacgc ctcctcacaa 60 
ggtttcctgc cacacttcca acatttcgcc acgcaggcga tccatgtggg ccaggatccg 120 
gagcaatgga cctccagggc tgtagtgccc cccatctcac tgtccaccac gttcaagcaa 180 
ggggcgcctg gccagcactc gggttttgaa tatagccgtt ctggaaatcc cactaggaat 240 
tgccttgaaa aagcagtggc agcactggat ggggctaagt actgtttggc ctttgcttca 300 
ggtttagcag ccactgtaac tattacccat cttttaaaag caggagacca aattatttgt 360 
atggatgatg tgtatggagg tacaaacagg tacttcaggc aagtggcatc tgaatttgga 420 
ttaaagattt cttttgttga ttgttccaaa atcaaattac tagaggcagc aattacacca 480 
gaaaccaagc ttgtttggat cgaaaccccc acaaacccca cccagaaggt gattgacatt 540 
gaaggctgtg cacatattgt ccataagcat ggagacatta ttttggtcgt ggataacact 600 
tttatgtcac catatttcca gcgccctttg gctctgggag ctgatatttc tatgtattct 660 
gcaacaaaat acatgaatgg ccacagtgat gttgtaatgg gcctggtgtc tgttaattgt 720 
gaaagccttc ataatagact tcgtttcttg caaaactctc ttggagcagt tccatctcct 780 
attgattgtt acctctgcaa tcgaggtctg aagactctac atgtccgaat ggaaaagcat 840 
ttcaaaaacg gaatggcagt tgcccagttc ctggaatcta atccttgggt agaaaaggtt 900 
atttatcctg ggctgccctc tcatccacag catgagttgg tgaagcgtca gtgtacaggt 960 
tgtacaggga tggtcacctt ttatattaag ggcactcttc agcatgctga gattttcctc 1020 
aagaacctaa agctatttac tctggccgag agcttgggag gattcgaaag ccttgctgag 1080 
cttccggcaa tcatgactca tgcatcagtt cttaagaatg acagagatgt ccttggaatt 1140 
agtgacacac tgattcgact ttctgtgggc ttagaggatg aggaagacct actggaagat 1200 
ctagatcaag ctttgaaggc agcacaccct ccaagtggaa ttcacagcta gtattccaga 1260 
gctgctatta gaagctgctt cctgtgaaga tcaatcttcc tgagtaatta atggaccaac 1320 
aatgag 1326 



<210> 29 
<211> 49 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: PCR product 



<400> 29 

cccacggtcg gggtacctgg gcgggacgcg ccaggccgac tcccggcga 49 



<210> 30 

<211> 3464 

<212> DNA 

<213> Homo sapiens 



<400> 30 

tttaatggac acataattta attatatatt ttttcttaca gatacccagg tgttctctct 60 
gatgtccagg aggagaaagg cattaagtac aaatttgaag tatatgagaa gaatgattaa 120 
tatgaaggtg ttttctagtt taagttgttc cccctccctc tgaaaaaagt atgtattttt 180 
acattagaaa aggttttttg ttgactttag atctataatt atttctaagc aactagtttt 240 
tattccccac tactcttgtc tctatcagat accatttatg agacattctt gctataacta 300 
agtgcttctc caagacccca actgagtccc cagcacctgc tacagtgagc tgccattcca 360 
cacccatcac atgtggcact cttgccagtc cttgacattg tcgggctttt cacatgttgg 420 
taatatttat taaagatgaa gatccacata cccttcaact gagcagtttc actagtggaa 480 
ataccaaaag cttcctacgt gtatatccag aggtttgtag ataaatgttg ccaccttgtt 540 
tgtaacagtg aaaaattgaa aacaacctgg aagtccagtg atgggaaaat gagtatgttt 600 
ctgtcttaga ttggggaacc caaagcagat tgcaagactg aaatttcagt gaaagcagtg 660 
tatttgctag gtcataccag aaatcatcaa ttgaggtacg gagaaactga actgagaagg 720 
taagaaaagc aatttaaagt cagcgagcag gttctcattg ataacaagct ccatactgct 780 
gagatacagg gaaatggagg ggggaaagct ggagtattga tcccgccccc ctccttggtt 840 
gtcagctccc tgtcctgtgt gtgggcggaa catagtccag ctgctctata gcaagtctca 900 
ggtgtttgca gtaagaagct gctggcatgc acgggaacag tgaatgccaa acacttaaag 960 
caattcgatg tttaagtatg taagttcttt tttttttaga cagcgtttcg ctcttgttgc 1020 
ccaggctagc atgcaatggt gtgacctcgg cttactgcaa cctccgcctt cccagattca 1080 
agcgattctc ctgcctcagg ctcccaagta gctaggacca ggtgcgcgcc accacgcccg 1140
gctaattttt gtattttgta tttttagtag agatggggtt tcaccatgtt ggtcaggcta 1200
gtctcgaact cgtgaccgca agcgattcac ccacctcagc ctcccaaagt gctgggatta 1260 
ccggcttgag ccaccacacc cggcacatct tcattctttt tatgtagtaa aaagtataag 1320 
gccacacatg gtttatttga agtattttat aatttaaaaa aatacagaag caggaaaacc 1380 
aattataagt tcaagtgagg gatgatggtt gcttgaacca aagggttgca tgtagtaaga 1440 
aattgtgatt taagatatat tttaaagtta taagtagcag gatattctga tggagtttga 1500 
ctttggtttt gggcccaggg agtttcagat gcctttgaga aatgaatgaa gtagagagaa 1560
aataaaagaa aaaccagcca ggcacagtgg ctcacacctg taatcccagc gctttgggag 1620
gctaaggcag gcagatcact tgagaccagc ttgggcaaca tggcaaagcc ccatctctac 1680 
aaaaaacaca aaaattagct gggcattgtg gcgcacacct gtattcccat ctagtcagga 1740 
agctgagatg gaagaattaa ttgagcccac gagttcaagg ctgcagtgag tcgtgattgt 1800 
gccactgcac tccagccggg gtgacagaag agaccttgtc tcgaaaacga atctgaaaac 1860 
aatggaacca tgccttcata attctagaaa gttattttca actgataaat ctatattcac 1920
ccaaataatc aagggtgaag gtaaaataat acatttttag acaagcaaag actcaggggt 1980 
tacctccatg tgcccttttt agggaagctg ttggagaaaa tactccagca aaatgaagga 2040 
gtacacaaac cagagaatga catgaatcca gcaaatagga tccaacacag gcaatattcc 2100 
agctatggag ctagctttaa aaaggaacag taaaaatatt aatcggttag ctgggtggaa 2160 
tggcccatgc ctgtagtccc agctactcag gaggctcagc agcaggacga cttgagccca 2220
agagttccag accagcctgg ccaccttagt gagatccctt ctcttaaaaa taataactta 2280 
ttgccagatt tggggcattt ggaaagaagt tcattgaaga taaagcaaaa gtaaaaaaaa 2340 
aaaaaaaaaa aacaagggga aagggttggt taggcaatca ttctagggca gaaagaagta 2400 
caggatagga agagcataat acactgtttt tctcaacaag gagcagtatg tacacagtca 2460 
taatgatgtg actgcttagc ccctaaatat ggtaactact ctgggacaat atgggaggaa 2520
aagtgaagat tgtgatggtg taagagctaa tcctcatctg tcatatccag aaatcactat 2580 
ataatatata ataatgaaat gactaagtta tgtgaggaaa aaaacagaag acattgctaa 2640 
aagagttaaa agtcattgct ctggagaatt aggagggatg gggcagggga ctgttaggat 2700
gcattataaa ctgaaaagcc tttttaaaat tttatgtatt aatatatgca ttcacttgaa 2760 
aaactaaaaa aaaacaataa tttggaaaaa cccatgaagg taactaacgg aaggaaaaac 2820 
taagagaatg aaaagtattt gcctctggaa agaacaactg gcaggactgt tgttttcatt 2880 



gtaagacttt tggagccatt taattgtact taaccatttt catctatttc tttaataaga 2940
acaattccat cttaataaag agttacactt gttaataagt gctggcctcc tgttgttctt 3000
tgtacacccc acacaaaatt tcaaagaaac tttgatggca atatatctcc atggtcagct 3060
taaaaataga gaaaggaaaa catagaatta gccaagagtc acacaaaaca aagatcagtt 3120
gtttgttagg aaacaatcaa aatcaagtct cactttttcc agattggctt atggaacagc 3180
actgtaaggt gataacttgg ggcaaacatg taaataataa aacatatgtt ttaaatattc 3240
aggttagcac attttatgtt tctgtgagat taaaattgtg tgtgacatac ccgcttcctt 3300
aaaggcaatg tttctgaaaa tgttgtacct gctattcctg aatcagggat gggtcccaga 3360
atctgccttt taaacatctc agataatctg aagcctgctt aagtttgtaa ggcactgctt 3420
ttgcactcta aggaagaaaa aaacaagttt taattcccgt ctct 3464



<210> 31 

<211> 1584 

<212> DNA 

<213> Homo sapiens 



<400> 31 

cggggcagct ctgaggaaca aggtggaagc tcagagcgct ggtctccacc ctggtgcccc 60
tgggctggtg ctggcagtgg gagccgtggc tgtggatgag agacatagac gagagagtga 120
gatggcctgg tttgccctct acctcctgag ccttctctgg gctacagctg ggactagtac 180
ccagacccag agttcatgct ccgttccctc agcacaggag cccttggtca atggaataca 240
agtactcatg gagaactcgg tgacttcatc agcctaccca aaccccagca tcctgattgc 300
catgaatctg gccggagcct acaacttgaa ggcccagaag ctcctgactt accagctcat 360
gtccagcgac aacaacgatc taaccattgg gcacctcggc ctcaccatca tggccctcac 420
ctcctcctgc cgagaccctg gggataaagt atccattcta caaagacaaa tggagaactg 480
ggcaccttcc agccccaacg ctgaagcatc agccttctat gggcccagtc tagcgatctt 540
ggcactgtgc cagaagaact ctgaggcgac cttgccgata gccgtccgct ttgccaagac 600
cctgctggcc aactcctctc ccttcaatgt agacacagga gcaatggcaa ccttggctct 660
gacctgtatg tacaacaaga tccctgtagg ttcagaggaa ggttacagat ccctgtttgg 720
tcaggtacta aaggatattg tggagaaaat cagcatgaag atcaaagata atggcatcat 780
tggagacatc tacagtactg gcctcgccat gcaggctctc tctgtaacac ctgagccatc 840
taaaaaggaa tggaactgca agaagactac ggatatgata ctcaatgaga ttaagcaggg 900
gaaattccac aaccccatgt ccattgctca aatcctccct tccctgaaag gcaagacata 960
cctagatgtg ccccaggtca cttgtagtcc tgatcatgag gtacaaccaa ctctacccag 1020
caaccctggc cctggcccca cctctgcatc taacatcact gtcatataca ccataaataa 1080
ccagctgagg ggggttgagc tgctcttcaa cgagaccatc aatgttagtg tgaaaagtgg 1140
gtcagtgtta cttgttgtcc tagaggaagc acagcgcaaa aatcctatgt tcaaatttga 1200
aaccacaatg acatcttggg gccttgtcgt ctcttctatc aacaatatcg cggaaaatgt 1260
taatcacaag acatactggc agtttcttag tggtgtaaca cctttgaatg aaggggttgc 1320
tgactacata cccttcaacc acgagcacat cacagccaat ttcacacagt actaacgaag 1380
aggtgggttc agcttctatc aaacatctcc aaaggatggg tgaaattttt tccacttcat 1440
tttaaatcta tgcaaaaaag cgaatgcctg tgatgctacc atattcctgg taaaaacatg 1500
gagaaccact atgtagaata aaaatgcaaa gttcactgga gtctcaacat ctatgactca 1560
tgaaaataaa attttcatct tctc 1584



<210> 32 

<211> 1537 

<212> DNA 

<213> Homo sapiens 



<400> 32 

gctctcatta ccttctgccc atcacttaat aaatagccag ccaattcatc aacattctgg 60
tacactgttg gagagatgag acagtcacac cagctgcccc tagtggggct cttactgttt 120
tcttttattc caagccaact atgcgagatt tgtgaggtaa gtgaagaaaa ctacatccgc 180
ctaaaacctc tgttgaatac aatgatccag tcaaactata acaggggaac cagcgctgtc 240
aatgttgtgt tgtccctcaa acttgttgga atccagatcc aaaccctgat gcaaaagatg 300
atccaacaaa tcaaatacaa tgtgaaaagc agattgtcag atgtaagctc gggagagctt 360
gccttgatta tactggcttt gggagtatgt cgtaacgctg aggaaaactt aatatatgat 420
taccacctga ctgacaagct agaaaataaa ttccaagcag aaattgaaaa tatggaagca 480
cacaatggca ctcccctgac taactactac cagctcagcc tggacgtttt ggccttgtgt 540
ctgttcaatg ggaactactc aaccgccgaa gttgtcaacc acttcactcc tgaaaataaa 600
aactattatt ttggtagcca gttctcagta gatactggtg caatggctgt cctggctctg 660
acctgtgtga agaagagtct aataaatggg cagatcaaag cagatgaagg cagtttaaag 720
aacatcagta tttatacaaa gtcactggta gaaaagattc tgtctgagaa aaaagaaaat 780
ggtctcattg gaaacacatt tagcacagga gaagccatgc aggccctctt tgtatcatca 840
gactattata atgaaaatga ctggaattgc caacaaactc tgaatacagt gctcacggaa 900
atttctcaag gagcattcag taatccaaac gctgcagccc aggtcttacc tgccctgatg 960
ggaaagacct tcttggatat taacaaagac tcttcttgcg tctctgcttc aggtaacttc 1020
aacatctccg ctgatgagcc tataactgtg acacctcctg actcacaatc atatatctcc 1080
gtcaattact ctgtgagaat caatgaaaca tatttcacca atgtcactgt gctaaatggt 1140
tctgtcttcc tcagtgtgat ggagaaagcc cagaaaatga atgatactat atttggtttc 1200
acaatggagg agcgctcatg ggggccctat atcacctgta ttcagggcct atgtgccaac 1260
aataatgaca gaacctactg ggaacttctg agtggaggcg aaccactgag ccaaggagct 1320
ggtagttacg ttgtccgcaa tggagaaaac ttggaggttc gctggagcaa atactaataa 1380
gcccaaactt tcctcagctg cataaaatcc atttgcagtg gagttccatg tttattgtcc 1440
ttatgccttc ttcttcattt atcccagtac gagcaggaga gttaataacc tccccttctc 1500
tctctacatg ttcaataaaa gttgttgaaa gattaac 1537



<210> 33 

<211> 1866 

<212> DNA 

<213> Homo sapiens 



<400> 33 

ccgattcttg ctcactgctc acccacctgc tgctgccatg aggcaccttg gggccttcct 60
cttccttctg ggggtcctgg gggccctcac tgagatgtgt gaaataccag agatggacag 120
ccatctggta gagaagttgg gccagcacct cttaccttgg atggaccggc tttccctgga 180
gcacttgaac cccagcatct atgtgggcct acgcctctcc agtctgcagg ctgggaccaa 240
ggaagacctc tacctgcaca gcctcaagct tggttaccag cagtgcctcc tagggtctgc 300
cttcagcgag gatgacggtg actgccaggg caagccttcc atgggccagc tggccctcta 360
cctgctcgct ctcagagcca actgtgagtt tgtcaggggc cacaaggggg acaggctggt 420
ctcacagctc aaatggttcc tggaggatga gaagagagcc attgggcatg atcacaaggg 480
ccacccccac actagctact accagtatgg cctgggcatt ctggccctgt gtctccacca 540
gaagcgggtc catgacagcg tggtggacaa acttctgtat gctgtggaac ctttccacca 600
gggccaccat tctgtggaca cagcagccat ggcaggcttg gcattcacct gtctgaagcg 660
ctcaaacttc aaccctggtc ggagacaacg gatcaccatg gccatcagaa cagtgcgaga 720
ggagatcttg aaggcccaga cccccgaggg ccactttggg aatgtctaca gcaccccatt 780
ggcattacag ttcctcatga cttcccccat gcctggggca gaactgggaa cagcatgtct 840
caaggcgagg gttgctttgc tggccagtct gcaggatgga gccttccaga atgctctcat 900
gatttcccag ctgctgcccg ttctgaacca caagacctac attgatctga tcttcccaga 960
ctgtctggca ccacgagtca tgttggaacc agctgctgag accattcctc agacccaaga 1020
gatcatcagt gtcacgctgc aggtgcttag tctcttgccg ccgtacagac agtccatctc 1080
tgttctggcc gggtccaccg tggaagatgt cctgaagaag gcccatgagt taggaggatt 1140
cacatatgaa acacaggcct cctcgtcagg cccctactta acctccgtga tggggaaagc 1200
ggccggagaa agggagttct ggcagcttct ccgagacccc aacaccccac tgttgcaagg 1260
tattgctgac tacagaccca aggatggaga aaccattgag ctgaggctgg ttagctggta 1320
gcccctgagc tccctcatcc cagcagcctc gcacactccc taggcttcta ccctccctcc 1380
tgatgtccct ggaacaggaa ctcgcctgac cctgctgcca cctcctgtgc actttgagca 1440
atgccccctg ggatcacccc agccacaagc ccttcgaggg ccctatacca tggcccacct 1500
tggagcagag agccaagcat cttccctggg aagtctttct ggccaagtct ggccagcctg 1560
gccctgcagg tctcccatga aggccacccc atggtctgat gggcatgaag catctcagac 1620
tccttggcaa aaaacggagt ccgcaggccg caggtgttgt gaagaccact cgttctgtgg 1680
ttggggtcct gcaagaaggc ctcctcagcc cgggggctat ggccctgacc ccagctctcc 1740
actctgctgt tagagtggca gctctgagct ggttgtggca cagtagctgg ggagacctca 1800
gcagggctgc tcagtgcctg cctctgacaa aattaaagca ttgatggcct gtggacctgc 1860
aaaaaa 1866



<210> 34 

<211> 2798 

<212> DNA 

<213> Homo sapiens 



<400> 34 

gccctctccc acagcggagt ccaaaacagg cctaccagtc agttcttatt tctattgggt 60
gtttccatgc tccaccatgt taagagctaa gaatcagctt tttttacttt cacctcatta 120
cctgaggcag gtaaaagaat catcaggctc caggctcata cagcaacgac ttctacacca 180
gcaacagccc cttcacccag aatgggctgc cctggctaaa aagcagctga aaggcaaaaa 240
cccagaagac ctaatatggc acaccccgga agggatctct ataaaaccct tgtattccaa 300
gagagatact atggacttac ctgaagaact tccaggagtg aagccattca cacgtggacc 360
atatcctacc atgtatacct ttaggccctg gaccatccgc cagtatgctg gttttagtac 420
tgtggaagaa agcaataagt tctataagga caacattaag gctggtcagc agggattatc 480
agttgccttt gatctggcga cacatcgtgg ctatgattca gacaaccctc gagttcgtgg 540
tgatgttgga atggctggag ttgctattga cactgtggaa gataccaaaa ttctttttga 600
tggaattcct ttagaaaaaa tgtcagtttc catgactatg aatggagcag ttattccagt 660
tcttgcaaat tttatagtaa ctggagaaga acaaggtgta cctaaagaga aacttactgg 720
taccatccaa aatgatatac taaaggaatt tatggttcga aatacataca tttttcctcc 780
agaaccatcc atgaaaatta ttgctgacat atttgaatat acagcaaagc acatgccaaa 840
atttaattca atttcaatta gtggatacca tatgcaggaa gcaggggctg atgccattct 900
ggagctggcc tatactttag cagatggatt ggagtactct agaactggac tccaggctgg 960
cctgacaatt gatgaatttg caccaaggtt gtctttcttc tggggaattg gaatgaattt 1020
ctatatggaa atagcaaaga tgagagctgg tagaagactc tgggctcact taatagagaa 1080
aatgtttcag cctaaaaact caaaatctct tcttctaaga gcacactgtc agacatctgg 1140
atggtcactt actgagcagg atccctacaa taatattgtc cgtactgcaa tagaagcaat 1200
ggcagcagta tttggaggga ctcagtcttt gcacacaaat tcttttgatg aagctttggg 1260
tttgccaact gtgaaaagtg ctcgaattgc caggaacaca caaatcatca ttcaagaaga 1320
atctgggatt cccaaagtgg ctgatccttg gggaggttct tacatgatgg aatgtctcac 1380
aaatgatgtt tatgatgctg ctttaaagct cattaatgaa attgaagaaa tgggtggaat 1440
ggccaaagct gtagctgagg gaatacctaa acttcgaatt gaagaatgtg ctgcccgaag 1500
acaagctaga atagattctg gttctgaagt aattgttgga gtaaataagt accagttgga 1560
aaaagaagac gctgtagaag ttctggcaat tgataatact tcagtgcgaa acaggcagat 1620
tgaaaaactt aagaagatca aatccagcag ggatcaagct ttggctgaac attgtcttgc 1680
tgcactaacc gaatgtgctg ctagcggaga tggaaatatc ctggctcttg cagtggatgc 1740
atctcgggca agatgtacag tgggagaaat cacagatgcc ctgaaaaagg tatttggtga 1800
acataaagcg aatgatcgaa tggtgagtgg agcatatcgc caggaatttg gagaaagtaa 1860
agagataaca tctgctatca agagggttca taaattcatg gaacgtgaag gtcgcagacc 1920
tcgtcttctt gtagcaaaaa tgggacaaga tggccatgac agaggagcaa aagttattgc 1980
tacaggattt gctgatcttg gttttgatgt ggacataggc cctcttttcc agactcctcg 2040
tgaagtggcc cagcaggctg tggatgcgga tgtgcatgct gtgggcgtaa gcaccctcgc 2100
tgctggtcat aaaaccctag ttcctgaact catcaaagaa cttaactccc ttggacggcc 2160
agatattctt gtcatgtgtg gaggggtgat accacctcag gattatgaat ttctgtttga 2220
agttggtgtt tccaatgtat ttggtcctgg gactcgaatt ccaaaggctg ccgttcaggt 2280
gcttgatgat attgagaagt gtttggaaaa gaagcagcaa tctgtataat atcctctttt 2340
tgttttagct tttgtctaaa atattatttt agttatgatc aaagaagaga gtaaagctat 2400
gtcttcaatt taatttcaat acctgatttg tactttcctt gaaagcttta ctttaaaata 2460
ccttacttat aggcctggtg tcatgctata agtatgtaca tacagtttca cttcaaaaat 2520
aaaaaaaaat ccctaaaaac tctctatact ctctataaca atactttatc aagaactctg 2580
gacaatggta ttatttttaa aaatcatggt gatgtattta ttagaatgtt tcttataaat 2640
ctctttcatt tttatattaa gaattaaact gtacctaaaa aaactctgac tattcccatt 2700
tctcagttta gcattacatt gtcttgagca ccagaaaata aaatccatat attaattaaa 2760
acctatcttg aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2798



<210> 35 

<211> 1637 

<212> DNA 

<213> Homo sapiens 



<400> 35 

aagaactggc ctgtacattt tcaaggaatt cttgagaggt tcttggagag attctgggag 60
ccaaacactc cattgggatc ctagctgttt tagagaacaa cttgtaatgg agccttcatc 120
tcttgagctg ccggctgaca cagtgcagcg cattgcggct gaactcaaat gccacccaac 180
ggatgagagg gtggctctcc acctagatga ggaagataag ctgaggcact tcagggagtg 240
cttttatatt cccaaaatac aggatctgcc tccagttgat ttatcattag tgaataaaga 300
tgaaaatgcc atctatttct tgggaaattc tcttggcctt caaccaaaaa tggttaaaac 360
atatcttgaa gaagaactag ataagtgggc caaaatagca gcctatggtc atgaagtggg 420
gaagcgtcct tggattacag gagatgagag tattgtaggc cttatgaagg acattgtagg 480
agccaatgag aaagaaatag ccctaatgaa tgctttgact gtaaatttac atcttctaat 540
gttatcattt tttaagccta cgccaaaacg atataaaatt cttctagaag ccaaagcctt 600
cccttctgat cattatgcta ttgagtcaca actacaactt cacggactta acattgaaga 660
aagtatgcgg atgataaagc caagagaggg ggaagaaacc ttaagaatag aggatatcct 720
tgaagtaatt gagaaggaag gagactcaat tgcagtgatc ctgttcagtg gggtgcattt 780
ttacactgga cagcacttta atattcctgc catcacaaaa gctggacaag cgaagggttg 840
ttatgttggc tttgatctag cacatgcagt tggaaatgtt gaactctact tacatgactg 900
gggagttgat tttgcctgct ggtgttccta caagtattta aatgcaggag caggaggaat 960
tgctggtgcc ttcattcatg aaaagcatgc ccatacgatt aaacctgcat tagtgggatg 1020
gtttggccat gaactcagca ccagatttaa gatggataac aaactgcagt taatccctgg 1080
ggtctgtgga ttccgaattt caaatcctcc cattttgttg gtctgttcct tgcatgctag 1140
tttagagatc tttaagcaag cgacaatgaa ggcattgcgg aaaaaatctg ttttgctaac 1200
tggctatctg gaatacctga tcaagcataa ctatggcaaa gataaagcag caaccaagaa 1260
accagttgtg aacataatta ctccgtctca tgtagaggag cgggggtgcc agctaacaat 1320
aacattttct gttccaaaca aagatgtttt ccaagaacta gaaaaaagag gagtggtttg 1380
tgacaagcgg aatccaaatg gcattcgagt ggctccagtt cctctctata attctttcca 1440
tgatgtttat aaatttacca atctgctcac ttctatactt gactctgcag aaacaaaaaa 1500
ttagcagtgt tttctagaac aacttaagca aattatactg aaagctgctg tggttatttc 1560
agtattattc gatttttaat tattgaaagt atgtcaccat tgaccacatg taactaacaa 1620
taaataatat accttac 1637



<210> 36 

<211> 1908 

<212> DNA 

<213> Homo sapiens 



<400> 36 

gaattcatga aaacgtagct cgtcctcaaa aaaaacagaa gaggagtaat cattttaagg 60
gagaaatata tacgaaagga acaagatttt gaagcaccca agctgccacc tacattaaaa 120
cacggtaggt ggctaaacac cagtcttcaa tgcccttcca cagcctcagt ctgaaaaata 180
ctgtgcaggt gacccaagtg aggggtcacc cttgggcttt tcctgtggca gtatctctgg 240
tttaaaaaca aacaaacgta cttattgcgt tgaaggacgg caacaggaag gactccatga 300
ttagtcacat ctataccatc ctaagaaact ttatccaccc aaactgtatt tcagacttta 360
taatctaaac tacaaaaagt gttcactggg gaactgcaca atatgactgc ttttaaccgt 420
agtgatttca aatattgagc catgctgttg cagtcttaaa aactggagac ctaagggcag 480
ctttcttcta gtcacccaat ccagcacttt tttaaaaaat cagtaaaact cttcgaccac 540
caaggaaaaa aaaaaaggat ggaggttaaa agacgcaccc cttgcccaca agccccctca 600
tcagaatggg agtcaggaga cctgagttcc tgtctcaggc ctgccattaa aaacctgcat 660
aacctttgcc tatctcctca aacggaagta ctaaaacctc agcgcttcac ccaatttgta 720
gccccggctg ggctcttccc accttcccct tcttcagccc gccccttcct cctccagccc 780
tatcatcggg cggagggtcc ccgcctccgc ccgccttacc cacaagcccc gcccccccag 840
ccccgatggc cctgcccagt cccagacaga acctactacg tgcggcggca gctggggcgg 900
gaaggcgggc gctgggggcg ctgcggccgc tgcagcgcag ggtccacctg gtcggctgca 960
cctgtggagg aggaggtgga tttcaggctt cccgtagact ggaagaatcg gctcaaaacc 1020
gcttgcctcg caggggctga gctggaggca gcgaggccgc ccgacgcagg cttccggcga 1080
gacatggcag ggcaaggatg gcagcccggc ggcagggccc ggcgaggagc gcgaacccgc 1140
ggccgcagtt cccaggcgtc tgcgggcgcg agcacgccgc gaccctgcgt gcgccggggc 1200
gggggggcgg ggcctcgcct gcacaaatag ggacgagggg gcggggcggc cacaatttcg 1260
cgccaaactt gaccgcgcgt tctgctgtaa cgagcgggct cggaggtcct cccgctgctg 1320
tcatggttgg ttcgctaaac tgcatcgtcg ctgtgtccca gaacatgggc atcggcaaga 1380
acggggacct gccctggcca ccgctcaggt atctgccggg ccggggcgat gggacccaaa 1440
cgggcgcagg ctgcccacgg tcggggtacc tgggcgggac gcgccggccg actcccggcg 1500
agaggatggg gccagacttg cggtctgcgc tggcaggaag ggtgggcccg actggattcc 1560
ccttttctgc tgcgcgggag gcccagttgc tgatttctgc ccggattctg ctgcccggtg 1620
aggtcttgcc ctgcggcgcc ctcgcccagg gcaaagtccc agccctggag aaaacacctc 1680
acccctaccc acagcgctcc gtttgtcagg tgccttagag ctcgagccca agggataatg 1740
tttcgagtaa cgctgtttct ctaacttgta ggaatgaatt cagatatttc cagagaatga 1800
ccacaacctc ttcagtagaa ggtaatgtgg gattaagtag ggtcttgctt gatgaagttt 1860
accagtgcaa atgttagtta aatggaaagt tttccgtgtt aatctggg 1908



<210> 37
<211> 30
<212> DNA

<213> Artificial Sequence



<220>

<223> Description of Artificial Sequence : primer
<400> 37

cccacggtcg gggtggccga ctcccggcga 30



<210> 38 
<211> 21 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : primer 
<400> 38 

ctaaactgca tcgtcgctgt g 21 



<210> 39 
<211> 19 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : primer 
<400> 39 

aaaaggggaa tccagtcgg 19 



<210> 40 
<211> 19 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: PCR product



<400> 40 

acctgggcgg gacgcgcca 19



<210> 41 

<211> 1275 

<212> DNA 

<213> Homo sapiens 



<400> 41 

ctgcagcgcc agggtccacc tggtcggctg cacctgtgga ggaggaggtg gatttcaggc 60
ttcccgtaga ctggaagaat cggctcaaaa ccgcttgcct cgcaggggct gagctggagg 120
cagcgaggcc gcccgacgca ggcttccggc gagacatggc agggcaagga tggcagcccg 180
gcggcagggc ccggcgagga gcgcgaaccc gcggccgcag ttcccaggcg tctgcgggcg 240
cgagcacgcc gcgaccctgc gtgcgccggg gcgggggggc ggggcctcgc ctgcacaaat 300
agggacgagg gggcggggcg gccacaattt cgcgccaaac ttgaccgcgc gttctgctgt 360
aacgagcggg ctcggaggtc ctcccgctgc tgtcatggtt ggttcgctaa actgcatcgt 420
cgctgtgtcc cagaacatgg gcatcggcaa gaacggggac ctgccctggc caccgctcag 480
gtatctgccg ggccggggcg atgggaccca aacgggcgca ggctgcccac ggtcggggta 540
cctgggcggg acgcgccagg ccgactcccg gcgagaggat ggggccagac ttgcggtctg 600
cgctggcagg aagggtgggc ccgactggat tccccttttc tgctgcgcgg gaggcccagt 660
tgctgatttc tgcccggatt ctgctgcccg gtgaggtctt tgccctgcgg cgccctcgcc 720
cagggcaaag tcccagccct ggagaaaaca cctcacccct acccacagcg ctccgtttgt 780
caggtgcctt agagctcgag cccaagggat aatgtttcga gtaacgctgt ttctctaact 840
tgtaggaatg aattcagata tttccagaga atgaccacaa cctcttcagt agaaggtaat 900
gtgggattaa gtagggtctt gcttgatgaa gtttaccagt gcaaatgtta gttaaatgga 960
aagttttccg tgttaatctg ggaccttttc tcttattatg gatctgtatg atctgtatgc 1020
agttcccaag gttcatttac cattattaaa aaatttttgt cttagaaatt ttatgtatgt 1080
caacgcacga gcaaattatc aggcatgggg cagaattggc aactgggtgg aggcttcggt 1140
ggaggttagc actccgaaag gaaaacagag taggcctttg gaacagctgc tggaagagat 1200
aaggcctgaa caagggcagt ggagaagaga gggtaaaaat tttttaaggt tacatgaccc 1260
tggattttgg agatc 1275



<210> 42 

<211> 1256 

<212> DNA 

<213> Homo sapiens 



<400> 42 

ctgcagcgcc agggtccacc tggtcggctg cacctgtgga ggaggaggtg gatttcaggc 60
ttcccgtaga ctggaagaat cggctcaaaa ccgcttgcct cgcaggggct gagctggagg 120
cagcgaggcc gcccgacgca ggcttccggc gagacatggc agggcaagga tggcagcccg 180
gcggcagggc ccggcgagga gcgcgaaccc gcggccgcag ttcccaggcg tctgcgggcg 240
cgagcacgcc gcgaccctgc gtgcgccggg gcgggggggc ggggcctcgc ctgcacaaat 300
agggacgagg gggcggggcg gccacaattt cgcgccaaac ttgaccgcgc gttctgctgt 360
aacgagcggg ctcggaggtc ctcccgctgc tgtcatggtt ggttcgctaa actgcatcgt 420
cgctgtgtcc cagaacatgg gcatcggcaa gaacggggac ctgccctggc caccgctcag 480
gtatctgccg ggccggggcg atgggaccca aacgggcgca ggctgcccac ggtcggggtg 540
gccgactccc ggcgagagga tggggccaga cttgcggtct gcgctggcag gaagggtggg 600
cccgactgga ttcccctttt ctgctgcgcg ggaggcccag ttgctgattt ctgcccggat 660
tctgctgccc ggtgaggtct ttgccctgcg gcgccctcgc ccagggcaaa gtcccagccc 720
tggagaaaac acctcacccc tacccacagc gctccgtttg tcaggtgcct tagagctcga 780
gcccaaggga taatgtttcg agtaacgctg tttctctaac ttgtaggaat gaattcagat 840
atttccagag aatgaccaca acctcttcag tagaaggtaa tgtgggatta agtagggtct 900
tgcttgatga agtttaccag tgcaaatgtt agttaaatgg aaagttttcc gtgttaatct 960
gggacctttt ctcttattat ggatctgtat gatctgtatg cagttcccaa ggttcattta 1020
ccattattaa aaaatttttg tcttagaaat tttatgtatg tcaacgcacg agcaaattat 1080
caggcatggg gcagaattgg caactgggtg gaggcttcgg tggaggttag cactccgaaa 1140
ggaaaacaga gtaggccttt ggaacagctg ctggaagaga taaggcctga acaagggcag 1200
tggagaagag agggtaaaaa ttttttaagg ttacatgacc ctggattttg gagatc 1256



<210> 43 
<211> 55 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: PCR product
<400> 43 

gctgcccacg gtcggggtac ctgggcggga cgcgccaggc cgactcccgg cgaga 55 

<210> 44 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR product
<400> 44 

gctgcccacg gtcggggtgg ccgactcccg gcgaga 36 



<210> 45 

<211> 1273 

<212> DNA 

<213> Homo sapiens 



<400> 45 

ctgcagcgca gggtccacct ggtcggctgc acctgtggag gaggaggtgg atttcaggct 60 
tcccgtagac tggaagaatc ggctcaaaac cgcttgcctc gcaggggctg agctggaggc 120
agcgaggccg cccgacgcag gcttccggcg agacatggca gggcaaggat ggcagcccgg 180 
cggcagggcc cggcgaggag cgcgaacccg cggccgcagt tcccaggcgt ctgcgggcgc 240 
gagcacgccg cgaccctgcg tgcgccgggg cgggggggcg gggcctcgcc tgcacaaata 300 
gggacgaggg ggcggggcgg ccacaatttc gcgccaaact tgaccgcgcg ttctgctgta 360 
acgagcgggc tcggaggtcc tcccgctgct gtcatggttg gttcgctaaa ctgcatcgtc 420
gctgtgtccc agaacatggg catcggcaag aacggggacc tgccctggcc accgctcagg 480
tatctgccgg gccggggcga tgggacccaa acgggcgcag gctgcccacg gtcggggtac 540 
ctgggcggga cgcgccggcc gactcccggc gagaggatgg ggccagactt gcggtctgcg 600 
ctggcaggaa gggtgggccc gactggattc cccttttctg ctgcgcggga ggcccagttg 660 
ctgatttctg cccggattct gctgcccggt gaggtctttg ccctgcggcg ccctcgccca 720 
gggcaaagtc ccagccctgg agaaaacacc tcacccctac ccacagcgct ccgtttgtca 780 
ggtgccttag agctcgagcc caagggataa tgtttcgagt aacgctgttt ctctaacttg 840 
taggaatgaa ttcagatatt tccagagaat gaccacaacc tcttcagtag aaggtaatgt 900 
gggattaagt agggtcttgc ttgatgaagt ttaccagtgc aaatgttagt taaatggaaa 960 
gttttccgtg ttaatctggg accttttctc ttattatgga tctgtatgat ctgtatgcag 1020 
ttcccaaggt tcatttacca ttattaaaaa atttttgtct tagaaatttt atgtatgtca 1080 
acgcacgagc aaattatcag gcatggggca gaattggcaa ctgggtggag gcttcggtgg 1140 
aggttagcac tccgaaagga aaacagagta ggcctttgga acagctgctg gaagagataa 1200
ggcctgaaca agggcagtgg agaagagagg gtaaaaattt tttaaggtta catgaccctg 1260 
gattttggag atc 1273



<210> 46 
<211> 18 



<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR product



<400> 46 

acctgggcgg gacgcgcc 18



